PCT

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6:
C07H 21/04, A61K 39/12, 39/245, C12Q 1/70, 1/68, G01N 33/53, C12P 1700,
C12N 5/00, A61K 39/00, 39/395

A1

(11) International Publication Number: WO 97/27208
(43) International Publication Date: 31 July 1997 (31.07.97)

(21) International Application Number: PCT/US97/01442

(22) International Filing Date: 28 January 1997 (28.01.97)

(30) Priority Data:
08/592,963    29 January 1996 (29.01.96)     US
08/757,669    29 November 1996 (29.11.96)    US



(71) Applicant: THE TRUSTEES OF COLUMBIA UNIVERSITY
IN THE CITY OF NEW YORK [US/US]; West 116th Street
and Broadway, New York, NY 10027 (US).

(72) Inventors: CHANG, Yuan; Apartment 3J, 90 Morningside
Drive, New York, NY 10027 (US). BOHENZKY, Roy, A.;
Apartment 115, 870 East El Camino Real, Mountain View,
CA 94040 (US). RUSSO, James, J.; Apartment 25E, 60
Haven Avenue, New York, NY 10032 (US). EDELMAN,
Isidore, S.; Apartment 61, 464 Riverside Drive, New York,
NY 10027 (US). MOORE, Patrick, S.; Apartment 3J, 90
Morningside Drive, New York, NY 10027 (US).

(74) Agent: WHITE, John, P.; Cooper & Dunham L.L.P., 1185
Avenue of the Americas, New York, NY 10036 (US).

(81) Designated States: AU, CA, JP, MX, European patent (AT,
BE, CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC,
NL, PT, SE).

Published
With international search report.


(54) Title: UNIQUE ASSOCIATED KAPOSI'S SARCOMA VIRUS SEQUENCES AND USES THEREOF

(57) Abstract

This invention provides an isolated nucleic acid molecule which encodes
Kaposi's Sarcoma-Associated Herpesvirus (KSHV) polypeptides. This
invention provides an isolated polypeptide molecule of KSHV. This
invention provides an antibody specific to the polypeptide. Antisense and
triplex oligonucleotide molecules are also provided. This invention
provides a vaccine for Kaposi's Sarcoma (KS). This invention provides
methods of vaccination, prophylaxis, diagnosis and treatment of a subject
with KS and of detecting expression of a DNA virus associated with
Kaposi's Sarcoma in a cell.



Applicants: Yuan Chang, et al.
Serial No.: 09/607,179
Filed: June 29, 2000
Exhibit 7

WO 97/27208    PCT/US97/01442



Kaposi's sarcoma-associated herpesvirus (KSHV) is a new human
herpesvirus (HHV8) believed to cause Kaposi's sarcoma (KS) [1,2].

Kaposi's sarcoma is the most common neoplasm occurring in persons with
acquired immunodeficiency syndrome (AIDS). Approximately 15-20% of AIDS
patients develop this neoplasm, which rarely occurs in immunocompetent
individuals. Epidemiologic evidence suggests that AIDS-associated KS
(AIDS-KS) has an infectious etiology. Gay and bisexual AIDS patients are
approximately twenty times more likely than hemophiliac AIDS patients to
develop KS, and KS may be associated with specific sexual practices among
gay men with AIDS. KS is uncommon among adult AIDS patients infected
through heterosexual or parenteral HIV transmission, or among pediatric
AIDS patients infected through vertical HIV transmission. Agents
previously suspected of causing KS include cytomegalovirus, hepatitis B
virus, human papillomavirus, Epstein-Barr virus (EBV), human herpesvirus
6, human immunodeficiency virus (HIV), and Mycoplasma penetrans.
Non-infectious environmental agents, such as nitrite inhalants, also have
been proposed to play a role in KS tumorigenesis. Extensive
investigations, however, have not demonstrated an etiologic association
between any of these agents and AIDS-KS.

UNIQUE ASSOCIATED KAPOSI'S SARCOMA VIRUS SEQUENCES AND USES THEREOF

The invention disclosed herein was made with Government support under a
co-operative agreement CCU2106 52 from the Centers for Disease Control and
Prevention, and under National Institutes of Health, National Cancer
Institute award CA67391 of the Department of Health and Human Services.
Accordingly, the U.S. Government has certain rights in this invention.

This application is a PCT International Application claiming priority of
U.S. Serial No. 08/757,669, filed November 29, 1996, and of U.S. Serial
No. 08/592,963, filed January 29, 1996, which is a continuation-in-part
application of PCT International Application No. [illegible], filed
November 21, [illegible], and PCT/US95/[illegible], filed August 11, 1995,
claiming priority of U.S. Serial No. [illegible], filed April 11, 1995,
and of U.S. Serial No. [illegible], filed November [illegible], which is a
continuation-in-part of U.S. Serial No. [illegible], filed August
[illegible].



Throughout this application, various publications may be referenced by
Arabic numerals in brackets. Full citations for these publications may be
found at the end of the Detailed Description of the Invention. The
disclosures of all publications cited herein are in their entirety hereby
incorporated by reference into this application in order to more fully
describe the state of the art to which this invention pertains.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1:

Annotated long unique region (LUR) and terminal repeat (TR) of the KSHV
genome. The orientation of identified ORFs in the LUR are denoted by the
direction of arrows, with ORFs similar to HVS in dark blue and dissimilar
ORFs in light blue. Seven blocks (numbered) of conserved herpesvirus
genes with nonconserved interblock regions (lettered) are shown under the
kilobase marker; the block numbering scheme differs from the original
description by Chee (Chee et al., 1990, Curr. Topics Microbiol. Immunol.
154, 125-169). The overlapping cosmid (Z prefix) and lambda (L prefix)
clones used to map the KSHV genome are compared to the KS5 lambda phage
clone from a KS lesion and shown below. Features and putative coding
regions not specifically designated are shown above the ORF map. Repeat
regions are shown as white lines (frnk, vnct, waka/jwka, zppa, moi,
mdsk). Putative coding regions and other features (see Experimental
Details Section I) not designated as ORFs are shown as solid


Figure 2A-2D:

(Fig. 2A) Sequence of terminal repeat unit (TR) demonstrating its high
G+C content (SEQ ID NO: 16). Sequences highly similar to conserved
herpesvirus pac1 sites are underlined with less similar sites to specific
pac1 and pac2 sequences italicized. (Fig. 2B) Southern blot of DNA from
BC-1 (lane 1), BCP-1 (lane 2) and a KS lesion (lane 3) digested with
Nde II, which cuts once in the TR sequence, and probed with a plasmid
containing the TR sequence. The intense

SUMMARY OF THE INVENTION

This invention provides an isolated nucleic acid molecule which encodes
Kaposi's Sarcoma-Associated Herpesvirus (KSHV) polypeptides. This
invention provides an isolated polypeptide molecule of KSHV. This
invention provides an antibody specific to the polypeptide. Antisense and
triplex oligonucleotide molecules are also provided. This invention
provides a vaccine for Kaposi's Sarcoma (KS). This invention provides
methods of vaccination, prophylaxis, diagnosis and treatment of a subject
with KS and of detecting expression of a DNA virus associated with
Kaposi's sarcoma in a cell.

Both KSHV MIP genes encode 19 residue N-terminus hydrophobic secretory
leader sequences which are relatively poorly conserved (vMIP-I also has a
second C-C dimer in the hydrophobic leader sequence without similarity to
the chemokine dicysteine motif). Potential O-linked glycosylation sites
for vMIP-I (gapped positions 22 and 21) are not present in vMIP-II, which
has only one predicted potential serine glycosylation site (position 51)
not found in vMIP-I. Fig. 3B. Alignment of the KSHV vIL-6 to human IL-6.
Fig. 3C-1 and 3C-2. Alignment of the KSHV vIRF polypeptide to human ICSBP
and ISGF3 with the putative ICS-binding tryptophans (W) for ICSBP and
ISGF3 in italics.

Figures 4A-4F:

Northern hybridization of total RNA extracted from BCP-1 and BC-1 cells
with or without 48 hour incubation with TPA and control P3HR1 cells after
TPA incubation. All four genes (Fig. 4A, vMIP-I; Fig. 4B, vMIP-II;
Fig. 4C, vIL-6; Fig. 4D, vIRF) are TPA inducible but constitutive,
noninduced expression of vIL-6 (Fig. 4C) and vIRF (Fig. 4D) is also
evident for BCP-1 and BC-1 and of vMIP-I for BCP-1 (Fig. 4A).
Representative hybridizations to a human β-actin probe (Figs. 4E-4F)
demonstrate comparable loading of RNA for cell preparations.



Figures 5A-5B:

Fig. 5A. Immunoblot of rabbit antipeptide antibodies generated from amino
acid sequences of vIL-6, TKYS?PKFDR (SEQ ID NO: 2) and PDVTFCVKDR (SEQ ID
NO: 2), against cell lysates of BCP-1, BC-1, P3HR1 cell lines with and
without TPA induction (lanes 1-6), 1 µg human rIL-6 (lane 7),

hybridization band at 0.8 kb represents copies of the Nde II-digested
single unit TR (Fig. 2C). A schematic representation (Fig. 2C) of genome
structures of KSHV in BCP-1 and BC-1 cell lines consistent with the data
presented in (Fig. 2B) and (Fig. 2D). Taq I (T) sites flank the TR
regions and Nde II (N) sites are within the TRs. Lower case tr refers to
the deleted truncated TR unit at the left end of the unique region. DR
represents the duplicated region of the LUR buried within the TR.
(Fig. 2D) Southern blot hybridization with TR probe of DNA from BC-1
(lane 1), BCP-1 (lane 2), a KS lesion (lane 3), and HBL-6 (lane 4)
digested with Taq I, which does not cut in the TR. Taq I-digested DNA
from both BC-1 (lane 1) and HBL-6 (lane 4) show similar TR hybridization
patterns suggesting identical insertion of a unique sequence into the TR
region, which sequencing studies demonstrate is a duplicated portion of
the LUR (see Experimental Details Section). BCP-1 hybridization (lane 2)
shows laddering consistent with a virus population having variable TR
region lengths within this cell line due to lytic replication. The
absence of TR laddering in KS lesion DNA (lane 3) suggests that a clonal
virus population is present in the tumor.

Figures 3A-3C:

CLUSTAL W alignments of KSHV-encoded polypeptide sequences to
corresponding human cell signaling pathway polypeptide sequences.
Fig. 3A. Two KSHV MIP-like polypeptides (vMIP-I and vMIP-II) are compared
to human MIP-1α, MIP-1β and [illegible]; amino acid identity to vMIP-I
indicated by black reverse shading, to vMIP-II alone by gray reverse
shading, and the C-C dimer motif is italicized.

demonstrates vIL-6 production in both KSHV-infected cell lines and
tissues. The KSHV-infected cell line BCP-1 (Fig. 7A), but not the control
EBV-infected cell line P3HR1 (Fig. 7B), shows prominent cytoplasmic vIL-6
localization. (Fig. 7C) Cytoplasmic localization of vIL-6 in
spindle-shaped cells from an AIDS-KS lesion. Of eight KS lesions, only
one had readily identifiable vIL-6 staining of a subpopulation of cells.
In contrast, the majority of pelleted lymphoma cells from a nonAIDS,
EBV-negative PEL have intense vIL-6 staining (Fig. 7E). No immunostaining
is present in control angiosarcoma (Fig. 7D) or multiple myeloma tissues
(Fig. 7F).

Figures 8A-8D:

Double antibody labeling of anti-vIL-6 and cell surface antigens.
Examples of both CD34 and CD20 colocalization with vIL-6 were found in a
KS lesion. Fig. 8A. CD34 (red) and vIL-6 (blue) colocalize in a KS
spindle cell (arrow). Purple coloration is due to overlapping chromagen
staining (100X). Fig. 8B. CD45 common leukocyte antigen staining (blue,
arrow) on vIL-6 (red) expressing Kaposi's sarcoma cells (100X). Fig. 8C.
Low power magnification (20X) demonstrating numerous vIL-6 producing
hematopoietic cells (red) in a lymph node from a patient with KS. Arrows
only indicate the most prominently staining cells; nuclei counterstained
with hematoxylin. Fig. 8D. Colocalization of CD20 (brown, arrows) with
vIL-6 (red) in an AIDS-KS patient's lymph node (100X).

Figure 9:

and concentrated COS7 rvIL-6 and [illegible] supernatants (lanes 8-9).
Anti-vIL-6 antibodies specifically recognize the viral IL-6 in both
recombinant supernatants and cell lysates but not human IL-6. The BCP-1
cell line constitutively expresses low levels of vIL-6 whereas
polypeptide expression increases on TPA treatment for both BC-1 (KSHV and
EBV coinfected) and BCP-1 (KSHV infection alone) indicating lytic phase
expression. Preimmune sera from immunized rabbits did not react on
immunoblotting to any of the preparations. Fig. 5B. Anti-huIL-6
monoclonal antibodies do not cross-react with cell-associated or
recombinant vIL-6 preparations.



Figure 6:

Dose-response curves for 3H-thymidine [illegible] IL-6-dependent B9 mouse
plasmacytoma cells [illegible] serial dilutions of rhuIL-6 (filled
squares) [illegible] COS7 supernatants of rvIL-6 [illegible], r6-LIv (open
squares) or control [illegible] (circles) pMET7 transfections.
[Illegible] supernatants from this transfection [illegible] show similar
B9 proliferation activity to huIL-6 [illegible] ng/ml whereas the reverse
construct [illegible] and the lacZ control show no increased ability to
induce [illegible]. [Illegible] supernatants at greater than [illegible]
have increased activity due to concentration of COS7 conditioning
factors.



Figures 7A-7F:

Rabbit anti-vIL-6 peptide antibody [illegible] localized using
goat-antirabbit immunoglobulin-peroxidase conjugate (brown) with
hematoxylin counterstaining (blue) at X100 magnification

DETAILED DESCRIPTION OF THE INVENTION

Definitions

The following standard abbreviations are used throughout the
specification to indicate specific nucleotides:

C=cytosine    A=adenosine
T=thymidine   G=guanosine

The term "nucleic acid", as used herein, refers to either DNA or RNA,
including complementary DNA (cDNA), genomic DNA and messenger RNA (mRNA).
As used herein, "genomic" means both coding and non-coding regions of the
isolated nucleic acid molecule. "Nucleic acid sequence" refers to a
single- or double-stranded polymer of deoxyribonucleotide or
ribonucleotide bases read from the 5' to the 3' end. It includes both
self-replicating plasmids, infectious polymers of DNA or RNA and
nonfunctional DNA or RNA.

The term "polypeptide", as used herein, refers to either the full length
gene product encoded by the nucleic acid, or portions thereof. Thus,
"polypeptide" includes not only the full-length protein, but also
partial-length fragments, including peptides less than fifty amino acid
residues in length.



The term "SSC" refers to a citrate-saline solution of 0.15 M sodium
chloride and 20 mM sodium citrate. Solutions are often expressed as
multiples or fractions of this concentration. For example, 6XSSC refers
to a solution having a sodium chloride and sodium citrate concentration
of 6 times this amount or 0.9 M sodium chloride and 120 mM sodium
citrate.

Quantification of [illegible] primary NSI SF162 and M22 HIV-1 strains and
HIV-2 strain ROD/B in the presence or absence of vMIP-I. CCC/CD4 cells
were transiently cotransfected with CCR5 alone, CCR5 plus empty pMET7
vector, CCR5 plus vMIP-I in pMET7 vector, or CCR5 plus the reverse
orientation I-PIMv. The results after 72 hours of incubation with each
retrovirus are expressed as a percentage of the foci forming units for
cells transfected with CCR5 alone. The forward vMIP-I construct inhibited
NSI HIV-1 replication but not HIV-2 replication while the reverse I-PIMv
construct had no effect on replication of any of the retroviruses.

length sequences. It being further understood that the sequence includes
the degenerate codons of the native sequence or sequences which may be
introduced to provide codon preference in a specific host cell.
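
The point about degenerate codons can be illustrated with a short sketch. It builds the standard genetic code and shows that two different DNA sequences can encode the same peptide. The two example sequences and the function names are hypothetical and chosen only for illustration; they are not taken from the KSHV sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Hypothetical example: both sequences encode Met-Lys-Leu.
native = "ATGAAACTG"
variant = "ATGAAGTTA"
print(translate(native), translate(variant))
print(synonymous_codons("L"))
```

A host-specific codon preference would amount to picking, for each residue, one entry from `synonymous_codons` according to that host's usage table, which this sketch does not model.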

A nucleic acid probe is "specific" for a target organism of interest if
it includes a nucleotide sequence which when detected is determinative of
the presence of the organism in the presence of a heterogeneous
population of proteins and other biologics. A specific nucleic acid probe
is targeted to that portion of the sequence which is determinative of the
organism and will not hybridize to other sequences, especially those of
the host, where a pathogen is being detected.

The phrase "expression cassette" refers to nucleotide sequences which are
capable of affecting expression of a structural gene in hosts compatible
with such sequences. Such cassettes include at least promoters and
optionally, transcription termination signals. Additional factors
necessary or helpful in effecting expression may also be used as
described herein.

The term "operably linked" as used herein refers to linkage of a promoter
upstream from a DNA sequence such that the promoter mediates
transcription of the DNA sequence.

The term "vector" refers to viral expression systems, autonomous
self-replicating circular DNA (plasmids), and includes both expression
and nonexpression plasmids. Where a recombinant microorganism or cell
culture is described as hosting an "expression vector," this includes
both extrachromosomal circular DNA and DNA that has been incorporated
into the host chromosome(s). Where a vector is being maintained by

0.2XSSC refers to a solution 0.2 times the SSC concentration or 0.03 M
sodium chloride and 4 mM sodium citrate.
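
The SSC multiples used in this text follow from simple scaling. As a minimal sketch, the function below (a hypothetical helper, not part of the specification) scales the 1X values exactly as this text defines them, 0.15 M sodium chloride and 20 mM sodium citrate. Other protocols may define 1X SSC differently, so the constants are assumptions tied to this document.

```python
# 1X SSC as defined in this text (other references may use different values).
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_MM = 20.0


def ssc_concentrations(multiple):
    """Return (NaCl in M, sodium citrate in mM) for a given multiple of SSC."""
    nacl_m = round(multiple * SSC_1X_NACL_M, 6)
    citrate_mm = round(multiple * SSC_1X_CITRATE_MM, 6)
    return nacl_m, citrate_mm


for multiple in (6, 0.2):
    nacl, citrate = ssc_concentrations(multiple)
    print(f"{multiple}XSSC: {nacl} M sodium chloride, {citrate} mM sodium citrate")
```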

The phrase "selectively hybridizing to" and the phrase "specific
hybridization" describe a nucleic acid probe that hybridizes, duplexes or
binds only to a particular target DNA or RNA sequence when the target
sequences are present in a preparation of total cellular DNA or RNA. By
selectively hybridizing it is meant that a probe binds to a given target
in a manner that is detectable in a different manner from non-target
sequence under high stringency conditions of hybridization.

"Complementary" or "target" nucleic acid sequences refer to those nucleic
acid sequences which selectively hybridize to a nucleic acid probe.
Proper annealing conditions depend, for example, upon a probe's length,
base composition, and the number of mismatches and their position on the
probe, and must often be determined empirically. For discussions of
nucleic acid probe design and annealing conditions, see, for example,
Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed.), Cold
Spring Harbor Laboratory, Vols. 1-3 or Ausubel, et al. (1987) Current
Protocols in Molecular Biology, New York.

The phrase "nucleic acid molecule encoding" refers to a nucleic acid
molecule which directs the expression of a specific polypeptide. The
nucleic acid sequences include both the DNA strand sequence that is
transcribed into RNA, the complementary DNA strand, and the RNA sequence
that is translated into protein. The nucleic acid molecule includes both
the full length nucleic acid sequence as well as non-full

Waterman (1981) Adv. Appl. Math. 2:482, by the algorithm of Needleman and
Wunsch (1970) J. Mol. Biol. 48:443, by the search-for-similarity method
of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. 85:2444, or by
computerized implementations of these algorithms (GAP, BESTFIT, FASTA,
and TFASTA in GCG, the Wisconsin Genetics Software Package Release 8.0,
Genetics Computer Group, 575 Science Dr., Madison, WI).
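
To make the idea of an optimal alignment concrete, the following is a minimal sketch of Needleman-Wunsch global alignment. The scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for readability; they are not the default weights of GAP or BESTFIT, and the function name is illustrative only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a prefix aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap          # b prefix aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align two residues
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

Smith-Waterman local alignment differs mainly in clamping each cell at zero and tracing back from the highest-scoring cell rather than the corner.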

As applied to polypeptides, the terms "substantial identity" or
"substantial sequence identity" mean that two peptide sequences, when
optimally aligned, such as by the programs GAP or BESTFIT using default
gap weights, share at least 90 percent sequence identity, preferably at
least 95 percent sequence identity, more preferably at least 99 percent
sequence identity or more.

"Percentage amino acid identity" or "percentage amino acid sequence
identity" refers to a comparison of the amino acids of two polypeptides
which, when optimally aligned, have approximately the designated
percentage of the same amino acids. For example, "95% amino acid
identity" refers to a comparison of the amino acids of two polypeptides
which when optimally aligned have 95% amino acid identity. Preferably,
residue positions which are not identical differ by conservative amino
acid substitutions. For example, the substitution of amino acids having
similar chemical properties, such as charge or polarity, is not likely to
affect the properties of a protein. Examples include glutamine for
asparagine or glutamic acid for aspartic acid.
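
Percentage identity over an existing alignment can be computed as sketched below. The sketch assumes the two sequences are already aligned to equal length with "-" marking gaps, and it counts identity over aligned (non-gap) columns only, which is one of several conventions in use. The conservative-pair set holds only the two examples named in this text, and the peptide sequences are hypothetical.

```python
# Conservative substitutions named in the text: Gln/Asn and Glu/Asp.
CONSERVATIVE_PAIRS = {frozenset("QN"), frozenset("ED")}


def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over the non-gap alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    identical = sum(x == y for x, y in columns)
    return 100.0 * identical / len(columns)


def differences_are_conservative(aligned_a, aligned_b):
    """True if every non-identical, non-gap column is a listed conservative pair."""
    return all(x == y or frozenset((x, y)) in CONSERVATIVE_PAIRS
               for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-")


# Hypothetical 20-residue peptides differing by one Gln -> Asn substitution.
reference = "MKTAYIAKQRQISFVKSHFS"
candidate = "MKTAYIAKNRQISFVKSHFS"
print(percent_identity(reference, candidate))
print(differences_are_conservative(reference, candidate))
```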

The phrase "substantially purified" or "isolated" when referring to a
herpesvirus polypeptide, means a chemical composition which is
essentially free of other cellular components. It is preferably in a

a host cell, the vector may either be stably replicated by the cells
during mitosis as an autonomous structure, or is incorporated within the
host's genome.



The term "plasmid" refers to an autonomous circular DNA molecule capable
of replication in a cell, and includes both the expression and
nonexpression types. Where a recombinant microorganism or cell culture is
described as hosting an "expression plasmid", this includes latent viral
DNA integrated into the host chromosome(s). Where a plasmid is being
maintained by a host cell, the plasmid is either being stably replicated
by the cells during mitosis as an autonomous structure or is incorporated
within the host's genome.

The phrase "recombinant protein" or "recombinantly produced protein"
refers to a polypeptide produced using non-native cells. The cells
produce the protein because they have been genetically altered by the
introduction of the appropriate nucleic acid sequence.



The following terms are used to describe the sequence relationships
between two or more nucleic acid molecules: "reference sequence",
"comparison window", "sequence identity", "percentage of sequence
identity", and "substantial identity". A "reference sequence" is a
defined sequence used as a basis for a sequence comparison; a reference
sequence may be a subset of a larger sequence, for example, as a segment
of a full-length cDNA or gene sequence given in a sequence listing or may
comprise a complete cDNA or gene sequence.

Optimal alignment of sequences in a comparison window may be conducted by
the algorithm of Smith and

"Biological sample" as used herein refers to any sample obtained from a
living organism or an organism that has died. Examples of biological
samples include body fluids and tissue specimens.

It will be readily understood by those skilled in the art and it is
intended here, that when reference is made to particular sequence
listings, such reference includes sequences which substantially
correspond to the listing and its complement, including allowances for
minor sequencing errors, single base changes, deletions, substitutions
and the like, such that any such sequence variation corresponds to the
nucleic acid sequence of the pathogenic organism or disease marker to
which the relevant sequence listing relates.

Nucleic Acid Molecule from KSHV

This invention provides an isolated nucleic acid molecule which encodes a
Kaposi's sarcoma-associated herpesvirus (KSHV) polypeptide.

In one embodiment, the isolated nucleic acid molecule which encodes a
KSHV polypeptide has the nucleotide sequence as set forth in GenBank
Accession Number U75698 and the start and stop codons set forth in
Table 1. In another embodiment, the isolated nucleic acid molecule which
encodes a KSHV polypeptide has the amino acid sequence defined by the
translation of the nucleotide sequence set forth in GenBank Accession
Number U75698 and the start and stop codons set forth in Table 1.

In one embodiment, the isolated nucleic acid molecule for a KSHV
polypeptide has the 5' untranslated sequence as set forth in GenBank
Accession Number U75698 upstream of the ATG start codon. In another

homogeneous state although it can be in either a dry or aqueous solution.
Purity and homogeneity are typically determined using analytical
chemistry techniques such as polyacrylamide gel electrophoresis or high
performance liquid chromatography. A protein which is the predominant
species present in a preparation is substantially purified. Generally, a
substantially purified or isolated protein will comprise more than 80% of
all macromolecular species present in the preparation. Preferably, the
protein is purified to represent greater than 90% of all macromolecular
species present. More preferably the protein is purified to greater than
95%, and most preferably the protein is purified to essential
homogeneity, wherein other macromolecular species are not detected by
conventional techniques.

The phrase "specifically binds to an antibody" or "specifically
immunoreactive with", when referring to a polypeptide, refers to a
binding reaction which is determinative of the presence of the KSHV
polypeptide of the invention in the presence of a heterogeneous
population of polypeptides and other biologics including viruses other
than KSHV. Thus, under designated immunoassay conditions, the specified
antibodies bind to the KSHV antigen and do not bind in a significant
amount to other antigens present in the sample.

"Specific binding" to an antibody under such conditions may require an
antibody that is selected for its specificity for a particular antigen.
For example, antibodies raised to KSHV antigens described herein can be
selected to obtain antibodies specifically immunoreactive with KSHV
polypeptides and not with other

for 2 hours at 37° C prior to infection. Infected cells are observed by
demonstrating morphological changes, as well as being viral antigen
positive.

For KSHV isolation, the virus is either harvested directly from cell
culture fluid by centrifugation, or the infected cells are harvested,
homogenized or lysed and the virus is separated from cellular debris and
purified by standard methods of isopycnic sucrose density gradient
centrifugation.

One skilled in the art may isolate and propagate KSHV employing the
following protocol. Long-term establishment of a B lymphoid cell line
infected with KSHV (e.g., RCC-1, HBL-6 or BCBL-1) is accomplished using
body-cavity based lymphomas and standard techniques (Glick, 1980,
Fundamentals of Human Lymphoid Culture, Marcel Dekker, New York; Knowles
et al., 1989, Blood 73, 752-796; Metcalf, 1984, Clonal Culture of
Hematopoietic Cells: Techniques and Applications, Elsevier, New York).

Fresh lymphoma tissue containing viable infected cells is filtered to
form a single cell suspension. The cells are separated by Ficoll-Paque
centrifugation and the lymphocyte layer is removed. The lymphocytes are
then placed at >1x10^[illegible] cells/ml into standard lymphocyte tissue
culture medium, such as RPMI 1640 supplemented with 10% fetal calf serum.
Immortalized lymphocytes containing KSHV are indefinitely grown in the
culture media while non-immortalized cells die during the course of
prolonged cultivation.

Further, KSHV may be propagated in a new cell line by removing media
supernatant containing the virus from a continuously-infected cell line
at a concentration of >1x10^[illegible] cells/ml. The media is
centrifuged at 2000xg

embodiment, the isolated nucleic acid molecule for a KSHV polypeptide has
the 3' untranslated sequence as set forth in GenBank Accession Number
U75698 downstream of the stop codon.

In one embodiment the isolated nucleic acid molecule is genomic DNA. In
another embodiment the isolated nucleic acid molecule is cDNA. In another
embodiment RNA is derived from the isolated nucleic acid molecule or is
capable of hybridizing with the isolated nucleic acid molecule.

Further, the nucleic acid molecule above may be associated with
lymphoproliferative diseases including, but not limited to: Hodgkin's
disease, non-Hodgkin's lymphoma, lymphatic leukemia, lymphosarcoma,
splenomegaly, reticular cell sarcoma, Sezary's syndrome, mycosis
fungoides, central nervous system lymphoma, AIDS related central nervous
system lymphoma, post-transplant lymphoproliferative disorders, and
Burkitt's lymphoma. A lymphoproliferative disorder is characterized as
being the uncontrolled clonal or polyclonal expansion of lymphocytes
involving lymph nodes, lymphoid tissue and other organs.

A. Isolation and Propagation of KSHV

KSHV can be propagated in vitro. For example, techniques for growing
herpesviruses have been described by Ablashi et al. in Virology
[illegible]. PHA stimulated cord blood mononuclear cells, macrophage,
neuronal [illegible] cocultivated with cerebrospinal fluid, plasma,
peripheral blood leukocytes, or tissue extract containing viral infected
cells or purified virus. The recipient cells are treated with 5 µg/ml
polybrene

The method for isolating the KSHV genome is based on Pellicer et al.,
1978, Cell 14, 133-141 and Gibson and Roizman, 1972, J. Virol. 10,
1044-52.

A final method for isolating the KSHV genome is clamped homogeneous
electric field (CHEF) gel electrophoresis. Agarose plugs are prepared by
resuspending cells infected with KSHV in 1% LMP agarose (Biorad) and 0.9%
NaCl at 42° C to a final concentration of 2.5 x 10^7 cells/ml. Solidified
agarose plugs are transferred into lysis buffer (0.5M EDTA pH 8.0, 1%
sarcosyl, proteinase K at 1 mg/ml final concentration) and incubated for
24 hours. Approximately 10^[illegible] cells are loaded in each lane.
Gels are run at a gradient of 6.0 V/cm with a run time of 28 h on a CHEF
Mapper XA pulsed field gel electrophoresis apparatus (Biorad), Southern
blotted and hybridized to KS631Bam, KS330Bam and an EBV terminal repeat
sequence.

To make a new cell line infected with KSHV, already-
infected cells are co-cultivated with a Raji cell line
separated by a 0.45µ filter. Approximately [...] x 10^[...]
already-infected BCBL-1 and 2 x 10^[...] Raji cells are
co-cultivated for 2-20 days in supplemented RPMI alone or
with 20 ng/ml 12-O-tetradecanoyl phorbol-13-acetate
(TPA). After 2-20 days co-cultivation, Raji cells are
removed, washed and placed in supplemented RPMI 1640
media. A Raji culture co-cultivated with BCBL-1 in 20
ng/ml TPA for 2 days survived and has been kept in
continuous suspension culture for >10 weeks. This
cell line, designated RCC-1 (Raji Co-Culture, No. 1)
remains PCR positive for the KSHV sequence after
multiple passages. RCC-1 cells periodically undergo
rapid cytolysis suggestive of lytic reproduction of
KSHV. Thus, RCC-1 is a Raji cell line newly-infected
with KSHV.

for 10 minutes and filtered through a 0.45µ filter to
remove cells. The media is applied in a 1:1 volume
with cells growing at >1 x 10^4 cells/ml for 45 hours.
The cells are washed, pelleted and placed in fresh
culture medium, then tested for KSHV after 14 days.

KSHV may be isolated from a cell line in the following
manner. An infected cell line is lysed using standard
methods, such as hyposmotic shock or Dounce
homogenization or using repeated cycles of freezing
and thawing in a small volume (<3 ml), and pelleted at
2000xg for 10 minutes. The supernatant is removed and
centrifuged again at 10,000xg for 15 minutes to remove
nuclei and organelles. The resulting low-speed, cell-
free supernatant is filtered through a 0.45µ filter
and centrifuged at 100,000xg for 1 hour to pellet the
virus. The virus can then be washed and re-pelleted.
The DNA is extracted from the viral pellet by standard
techniques (e.g., phenol/chloroform) and tested for
the presence of KSHV by Southern blotting and/or PCR
using the specific probes described above.

For banding whole virion, the low-speed cell-free
supernatant is adjusted to [...]. The
PEG-supernatant is spun at 10,000 xg for 30 min. The
supernatant is poured off and the pellet collected and
resuspended in a small volume (1-2 ml) of virus buffer
(VB, 0.1 M NaCl, 0.01 M Tris, pH 7.5). The virion are
isolated by centrifugation at 15,000 rpm in a 10-50%
sucrose gradient made with VB. One ml fractions of
the gradient are obtained by standard techniques
(e.g., using a fractionator) and each fraction is
tested by dot blotting using specific hybridizing
probes to determine the gradient fraction containing
the purified virus (preparation of the fraction is
needed in order to detect the presence of the virus,
i.e., standard DNA extraction).

In one embodiment the molecule is DNA. In another
embodiment the molecule is RNA.

In one embodiment the TR molecule contains cis-active
elements required for DNA replication and packaging.
In another embodiment the TR molecule is contained in
a gene-cloning vector. In another embodiment the TR
molecule is contained in a gene-therapy vector. In
another embodiment the gene-therapy vector is
expressed in lymphoid cells. In another embodiment,
the TR comprises a molecular marker for determining
the clonality of a tumor. In another embodiment, the
marker provides a defining feature of the natural
history of a tumor in a diagnostic assay.



This invention provides a B-lymphotrophic DNA vector
comprising a plasmid or other self-replicable DNA
molecule containing the 801 bp KSHV TR or a portion
thereof.



High stringency hybridization conditions are selected [...]
for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic
strength and pH) at which 50% of the [...] salt
concentration is at least about 0.02 molar at pH 7 and
the temperature is at least about 60°C. As other
factors may significantly affect the stringency of
hybridization, including, among others, base
composition and size of the complementary strands, the
presence of organic solvents, i.e. salt or formamide
concentration, and the extent of base mismatching, the
combination of parameters is more important than the
absolute measure of any one. For example, high
stringency may be attained by overnight hybridization

RCC-1 and RCC-1[...] were deposited on October [...]
under ATCC Accession No. CRL 11734 and CRL [...],
respectively, pursuant to the Budapest Treaty on the
International Deposit of Microorganisms for the
Purposes of Patent Procedure with the Patent Culture
Depository of the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852 U.S.A.
HBL-6 was deposited (as BHL-6) on November 15, 1994
under ATCC Accession No. CRL 11762 pursuant to the
Budapest Treaty on the International Deposit of
Microorganisms for the Purposes of Patent Procedure
with the Patent Culture Depository of the American
Type Culture Collection, 12301 Parklawn Drive,
Rockville, Maryland 20852 U.S.A.



This invention provides a nucleic acid molecule of at
least 14 nucleotides capable of specifically
hybridizing with the isolated nucleic acid molecule
as set forth in GenBank Accession Numbers U75698,
U75699, U75700.



In one embodiment the nucleic acid molecule set forth
in GenBank Accession Number U75698 comprises the long
unique region (LUR) encoding KSHV polypeptides. In
another embodiment the nucleic acid molecule set forth
in GenBank Accession Number U75699 comprises the
prototypical terminal repeat (TR). In another
embodiment the nucleic acid molecule set forth in
GenBank Accession Number U75700 comprises the
incomplete terminal repeat (ITR).

In one embodiment the molecule is [...] to 36 nucleotides.
In another embodiment the molecule is [...] to [...]
nucleotides. In another embodiment the molecule

it contains an upstream promoter in the presence of
the appropriate RNA polymerase.

As defined herein nucleic acid probes may be DNA or
RNA fragments. DNA fragments can be prepared, for
example, by digesting plasmid DNA, or by use of PCR,
or synthesized by either the phosphoramidite method
described by Beaucage and Caruthers, 1981,
Tetrahedron Lett. 22, 1859-1862 or by the triester
method according to Matteucci et al., 1981, Am. Chem.
Soc. 103:3135. A double stranded fragment may then be
obtained, if desired, by annealing the chemically
synthesized single strands together under appropriate
conditions or by synthesizing the complementary strand
using DNA polymerase with an appropriate primer
sequence. Where a specific sequence for a nucleic
acid probe is given, it is understood that the
complementary strand is also identified and included.
The complementary strand will work equally well in
situations where the target is a double-stranded
nucleic acid. It is also understood that when a
specific sequence is identified for use as a nucleic
acid probe, a subsequence of the listed sequence which is
25 base pairs (bp) or more in length is also
encompassed for use as a probe.
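
As an illustration of the two statements above (the complementary
strand is included, and any subsequence of 25 bp or more may serve as
a probe), the following minimal Python sketch derives both from a
listed sequence. The function names and the 30-base example sequence
are hypothetical and are not taken from the KSHV genome; the sketch
only enumerates candidates and says nothing about which would
hybridize well.

```python
# Illustrative sketch only: helper names and the example sequence are
# assumptions made for this example, not part of the specification.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_subsequences(seq: str, min_len: int = 25):
    """Yield every contiguous subsequence of at least min_len bases."""
    for length in range(min_len, len(seq) + 1):
        for start in range(len(seq) - length + 1):
            yield seq[start:start + length]


# A made-up 30-base "listed sequence" used purely for demonstration.
listed = "ATGGCGTACCTGAAGTTCGACCTGATCGGA"

complement_strand = reverse_complement(listed)
candidates = list(probe_subsequences(listed))
```

Each candidate in `candidates` can be passed through
`reverse_complement` to obtain the matching probe for the opposite
strand.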

The nucleic acid molecules of the subject invention
also include molecules coding for polypeptide analogs,
fragments or derivatives of antigenic polypeptides
which differ from naturally-occurring forms in terms
of the identity or location of one or more amino acid
residues (deletion analogs containing less than all of
the residues specified for the polypeptide,
substitution analogs wherein one or more residues
specified are replaced by other residues and addition
analogs wherein one or more amino acid residues is
added to a terminal or medial portion of the

temperature with 6X SSC solution, followed by washing
at about 68°C in a 0.6X SSC solution.



Hybridization with moderate stringency may be attained
for example by: 1) filter pre-hybridizing and
hybridizing with a solution of 2X SSC, 50% formamide,
0.1M Tris buffer at pH 7.5, 5X Denhardt's solution;
2) pre-hybridization at 37°C for 4 hours; 3)
hybridization at 37°C with amount of labeled probe
equal to 3,000,000 cpm total for 16 hours; 4) wash in
[...]X SSC and 0.1% SDS solution; 5) wash 4X for 1 minute
each at room temperature in 4X SSC and at 60°C for 30
minutes each; and 6) dry and expose to film.
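
The hybridization and wash steps above are specified as multiples of
SSC. As a rough aid to comparing their ionic strength, the sketch
below converts an SSC strength into sodium ion molarity and computes
the volumes needed to dilute a concentrated stock. It assumes the
common laboratory definition of 1X SSC (0.15 M NaCl, 0.015 M
trisodium citrate), which the text itself does not state; the
function names and the 20X stock in the usage lines are likewise
illustrative assumptions.

```python
# Sketch under stated assumptions: 1X SSC is taken to be 0.15 M NaCl
# plus 0.015 M trisodium citrate (a common definition, not given in
# the text). Function names are hypothetical.

NACL_M_PER_X = 0.150      # mol/L NaCl in 1X SSC (assumed)
CITRATE_M_PER_X = 0.015   # mol/L trisodium citrate in 1X SSC (assumed)


def ssc_sodium_molarity(fold: float) -> float:
    """Sodium ion concentration (mol/L) of an SSC solution of given strength.

    Each NaCl contributes one Na+ and each trisodium citrate three.
    """
    return fold * (NACL_M_PER_X + 3 * CITRATE_M_PER_X)


def dilution_volumes(stock_fold: float, target_fold: float, final_ml: float):
    """Return (stock_ml, water_ml) needed to make final_ml at target_fold."""
    stock_ml = final_ml * target_fold / stock_fold
    return stock_ml, final_ml - stock_ml


# Example: sodium molarity of 2X SSC, and making 500 ml of it
# from an assumed 20X stock.
na_2x = ssc_sodium_molarity(2)
stock_ml, water_ml = dilution_volumes(20, 2, 500)
```

Lower SSC strength means lower sodium concentration, which is one of
the parameters that raises stringency at a fixed temperature.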



Nucleic acid probe technology is well known to those
skilled in the art who readily appreciate that such
probes may vary greatly in length and may be labeled
with a detectable label, such as a radioisotope or
fluorescent dye, to facilitate detection of the probe.
DNA probe molecules may be produced by insertion of a
DNA molecule having the full-length or a fragment of
the isolated nucleic acid molecule of the DNA virus
into suitable vectors, such as plasmids or
bacteriophages, followed by transforming into suitable
bacterial host cells, replication in the transformed
bacterial host cells and harvesting of the DNA probes,
using methods well known in the art. Alternatively,
probes may be generated chemically from DNA
synthesizers.

RNA probes may be generated by inserting the full
length or a fragment of the isolated nucleic acid
molecule of the DNA virus downstream of a
bacteriophage promoter such as T3, T7 or SP6. Large
amounts of RNA probe may be produced by incubating the
labeled nucleotides with a linearized isolated nucleic
acid molecule of the DNA virus or its fragment where

In another embodiment, DHFR has the amino acid
sequence as set forth in SEQ ID NO: 1.

In another embodiment, KSHV DHFR is inhibited by a
sulfa drug known to inhibit bacterial DHFR. In a
preferred embodiment, KSHV DHFR is inhibited by
methotrexate or a derivative thereof known to inhibit
mammalian DHFR. In another embodiment, the sulfa
drug, methotrexate or a derivative thereof is
selective among the human herpesviruses for inhibition
of KSHV.

This invention provides the isolated KSHV polypeptide
comprising thymidylate synthase (TS) encoded by ORF
70. In one embodiment, TS participates in KSHV
nucleotide metabolism. In another embodiment, TS
comprises an enzyme essential for viral replication,
inhibition of which prevents virus production. In
another embodiment, TS comprises a subunit vaccine.
In another embodiment, TS comprises an antigen for
immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising DNA polymerase encoded by ORF [...]. In one
embodiment, DNA polymerase comprises an enzyme
essential for viral replication, inhibition of which
prevents virus production. In another embodiment, DNA
polymerase comprises a subunit vaccine. In another
embodiment, DNA polymerase comprises an antigen for
immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising alkaline exonuclease encoded by ORF 37. In
one embodiment, alkaline exonuclease packages KSHV DNA
into the virus particle. In another embodiment,
alkaline exonuclease comprises an enzyme essential for

polypeptides) and which share some or [...] of
naturally-occurring forms. These molecules
include: the incorporation of codons "preferred" for
expression by selected non-mammalian hosts; the
provision of sites for cleavage by restriction
endonuclease enzymes; and the provision of additional
initial, terminal or intermediate DNA sequences that
facilitate construction of readily expressed vectors.

C. Polypeptides of KSHV and Antibodies (Ab's) Thereto

This invention provides an isolated KSHV polypeptide, 
one from the list as set forth in Table 1 and below. 



This invention provides the isolated KSHV polypeptide
comprising viral macrophage inflammatory protein III
(vMIP-III). In one embodiment, vMIP-III comprises an
orphan cytokine. In another embodiment, vMIP-III is
encoded by nucleotides 22,529-22,185. In another
embodiment, vMIP-III comprises an anti-inflammatory
drug. In a preferred embodiment, the drug is useful
in treatment of an autoimmune disorder. In the most
preferred embodiment, the drug is useful in treatment
of rheumatoid arthritis.
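
Coordinates such as "nucleotides 22,529-22,185" above list the larger
number first. One hedged reading is that the first number is the start
of the coding region and the second its end, so a descending pair
denotes the reverse strand. The sketch below computes strand, length
and codon count under that assumption and under the further
assumption that both endpoints are inclusive; neither convention is
stated in the text, and the function name is hypothetical.

```python
# Sketch under stated assumptions: coordinates are 1-based and
# inclusive, and the first number is the 5' end of the coding region.

def orf_span(start: int, end: int) -> dict:
    """Summarize a coding region given as 'start-end' genome coordinates."""
    length = abs(start - end) + 1          # inclusive endpoints (assumed)
    strand = "+" if start <= end else "-"  # descending pair = reverse strand
    return {
        "strand": strand,
        "length_nt": length,
        "codons": length // 3,
        "in_frame": length % 3 == 0,
    }


# The coordinates quoted in the text for vMIP-III.
vmip3 = orf_span(22529, 22185)
```

Under these assumptions the quoted span is a whole number of codons,
which is at least consistent with it describing a complete open
reading frame.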

This invention provides the isolated KSHV polypeptide
comprising dihydrofolate reductase (DHFR) encoded by
ORF 2. In one embodiment, DHFR participates in KSHV
nucleotide synthesis. In another embodiment, DHFR
comprises an enzyme essential for viral replication,
inhibition of which prevents virus production. In
another embodiment, DHFR comprises a subunit vaccine.
In another embodiment, DHFR comprises an antigen for
immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising viral protein kinase encoded by ORF 3[...]. In
another embodiment, viral protein kinase comprises an
antigen for immunologic assays. In another
embodiment, viral protein kinase comprises a subunit
vaccine.

This invention provides the isolated KSHV polypeptide
comprising lytic cycle transactivator protein (LCTP)
encoded by ORF 50. In one embodiment, LCTP is
required for activation of productive infection from
the latent state. In another embodiment, LCTP is
inhibited by known antiviral drugs. In another
embodiment, prevention of LCTP expression maintains
the virus in a latent state unable to replicate.

This invention provides the isolated KSHV polypeptide
comprising ribonucleotide reductase, a two-subunit
enzyme in which the small and large subunits are
encoded by ORF 60 and ORF 61, respectively. In
another embodiment, ribonucleotide reductase catalyzes
conversion of ribonucleotides into
deoxyribonucleotides for DNA replication. In another
embodiment, ribonucleotide reductase is inhibited by
known antiviral drugs in terminally differentiated
cells not expressing cellular ribonucleotide
reductase. In another embodiment, ribonucleotide
reductase comprises an antigen for immunologic assays.
In another embodiment, ribonucleotide reductase
comprises a subunit vaccine. In another embodiment,
ribonucleotide reductase comprises a transforming
agent for establishment of immortalized cell lines.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K1.

production. In another embodiment, alkaline
exonuclease comprises a subunit vaccine. In another
embodiment, alkaline exonuclease comprises an antigen
for immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising helicase-primase, subunits 1, 2 and 3
encoded by ORFs 40, 41 and 44, respectively. In one
embodiment, helicase-primase comprises an enzyme
activity essential for viral DNA replication. In
another embodiment, helicase-primase is inhibited by
nucleotide analogs. In another embodiment, helicase-
primase is inhibited by known antiviral drugs. In
another embodiment, inhibition of helicase-primase
prevents KSHV replication.



This invention provides the isolated KSHV polypeptide
comprising uracil DNA glycosylase (UDG) encoded by ORF
4[...]. In one embodiment, uracil DNA glycosylase
comprises an enzyme essential for KSHV DNA repair
during DNA replication. In another embodiment, uracil
DNA glycosylase is inhibited by known antiviral drugs.
In another embodiment, uracil DNA glycosylase
comprises a subunit vaccine. In another embodiment,
uracil DNA glycosylase comprises an antigen for
immunologic assays.



This invention provides the isolated KSHV polypeptide
comprising single-stranded DNA binding protein (SSBP)
encoded by ORF 06. In one embodiment, SSBP comprises
an enzyme essential for KSHV DNA replication. In
another embodiment, SSBP is inhibited by known
antiviral drugs. In another embodiment, SSBP
increases the processivity of polymerase reactions
such as in the conventional PCR method for DNA

This invention provides the isolated KSHV polypeptide
comprising vMIP-I encoded by ORF K6. In one
embodiment, vMIP-I comprises an anti-inflammatory
drug. In a preferred embodiment, the drug is useful
in treatment of an autoimmune disorder. In the most
preferred embodiment, the drug is useful in treatment
of rheumatoid arthritis.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K7.

This invention provides the isolated KSHV polypeptide
comprising Bcl-2 encoded by ORF 16.

This invention provides the isolated KSHV polypeptide
comprising capsid protein I encoded by ORF 17.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 18.

This invention provides the isolated KSHV polypeptide
comprising tegument protein I encoded by ORF 19.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 20.

This invention provides the isolated KSHV polypeptide
comprising thymidine kinase encoded by ORF 21.

This invention provides the isolated KSHV polypeptide
comprising glycoprotein H encoded by ORF 22.

In one embodiment, the isolated KSHV polypeptide
comprises the protein encoded by ORF 23.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 24.

This invention provides the isolated KSHV polypeptide
comprising complement-binding protein [...]
encoded by ORF 4.

This invention provides the isolated KSHV polypeptide
comprising transport protein encoded by ORF [...].

This invention provides the isolated KSHV polypeptide
comprising glycoprotein B encoded by ORF 6.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 10.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 11.

This invention provides the isolated KSHV polypeptide
comprising viral interleukin 6 (vIL-6) encoded by ORF
K2. In one embodiment, antibodies selectively
recognizing vIL-6 allow differentiation among
lymphomas.

This invention provides the isolated KSHV polypeptide
comprising BHV4-IE1 I encoded by ORF K3.

This invention provides the isolated KSHV polypeptide
comprising vMIP-II encoded by ORF K4. In one
embodiment, vMIP-II comprises an anti-inflammatory
drug. In a preferred embodiment, the drug is useful
in treatment of an autoimmune disorder. In the most
preferred embodiment, the drug is useful in treatment
of rheumatoid arthritis.

This invention provides the isolated KSHV polypeptide
comprising BHV4-IE1 II encoded by ORF K5.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 36.

This invention provides the isolated KSHV polypeptide
comprising glycoprotein K encoded by ORF 39.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 42.

This invention provides the isolated KSHV polypeptide
comprising capsid protein III encoded by ORF 42.

This invention provides the isolated KSHV polypeptide
comprising virion assembly protein encoded by ORF 45.

This invention provides the isolated KSHV polypeptide
comprising glycoprotein L encoded by ORF 47.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 45.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 49.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K8.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF [...]2.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by [...].

This invention provides the isolated KSHV polypeptide
comprising dUTPase encoded by ORF 54.

This invention provides the isolated KSHV polypeptide
comprising major capsid protein encoded by ORF [...].

This invention provides the isolated KSHV polypeptide
comprising capsid protein II encoded by ORF 26.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 27.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 26.

This invention provides the isolated KSHV polypeptide
comprising packaging protein II encoded by ORF 2[...].

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 20.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 21.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 32.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF [...].

This invention provides the isolated KSHV polypeptide
comprising packaging protein [...] encoded by [...].

This invention provides the isolated KSHV polypeptide 

This invention provides the isolated KSHV polypeptide
comprising tegument protein III encoded by ORF 64.

This invention provides the isolated KSHV polypeptide
comprising capsid protein IV encoded by ORF 65.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 66.

This invention provides the isolated KSHV polypeptide
comprising tegument protein IV encoded by ORF 67.

This invention provides the isolated KSHV polypeptide
comprising glycoprotein encoded by ORF 68.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 65.

This invention provides the isolated KSHV polypeptide
comprising Kaposin encoded by ORF K12.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K13.

This invention provides the isolated KSHV polypeptide
comprising cyclin D encoded by ORF 72.

This invention provides the isolated KSHV polypeptide
comprising immediate-early protein (IEP) encoded by
ORF 73.

This invention provides the isolated KSHV polypeptide
comprising OX-1 encoded by ORF K14.

This invention provides the isolated KSHV polypeptide
comprising G-protein coupled receptor encoded by ORF

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 55.

This invention provides the isolated KSHV polypeptide
comprising DNA replication protein I encoded by ORF
56.

This invention provides the isolated KSHV polypeptide
comprising immediate early protein II (IEP-II) encoded
by ORF 57.

This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 1
(vIRF1; ICSBP) encoded by ORF K9. In one embodiment,
vIRF1 is a transforming polypeptide.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K10.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K11.

This invention provides the isolated KSHV polypeptide
comprising phosphoprotein encoded by ORF 55.

This invention provides the isolated KSHV polypeptide
comprising DNA replication protein II encoded by ORF
59.

This invention provides the isolated KSHV polypeptide
comprising assembly/DNA maturation protein encoded by
ORF 62.

This invention provides the isolated KSHV polypeptide
comprising tegument protein II encoded by ORF [...].

This invention provides an antibody which specifically
binds to the polypeptide encoded by the isolated
nucleic acid molecule. In one embodiment the antibody
is a monoclonal antibody. In another embodiment the
antibody recognizes an epitope of the KSHV
polypeptide. In another embodiment the antibody is a
polyclonal antibody. In another embodiment the
antibody recognizes more than one epitope of the KSHV
polypeptide. In another embodiment the antibody is an
anti-idiotypic antibody.

An antibody, polypeptide or isolated nucleic acid
molecule may be labeled with a detectable marker
including, but not limited to: a radioactive label, or
a colorimetric, a luminescent, or a fluorescent
marker, or gold. Radioactive labels include, but are
not limited to: [...] and 186Re. Fluorescent markers
include, but are not limited to: fluorescein,
rhodamine and auramine. Colorimetric markers include,
but are not limited to: biotin, and digoxigenin.
Methods of producing the polyclonal or monoclonal
antibody are known to those of ordinary skill in the
art.



Further, the antibody, polypeptide or nucleic acid
molecule may be detected by a second antibody which
may be linked to an enzyme, such as alkaline
phosphatase or horseradish peroxidase. Other enzymes
which may be employed are well known to one of
ordinary skill in the art.

This invention provides a method of producing a
polypeptide encoded by the isolated nucleic acid
molecule, which comprises growing a host-vector system
under suitable conditions permitting production of the
polypeptide and recovering the polypeptide so

This invention provides the isolated KSHV polypeptide
comprising tegument protein/FGARAT encoded by ORF [...].

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K15.

This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 2
(vIRF2) encoded by nucleotides [...].

This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 3
(vIRF3) encoded by nucleotides 90,541-89,600.

This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 4
(vIRF4) encoded by nucleotides 94,127-[...].

This invention provides the isolated KSHV polypeptide
comprising a precursor of [...]
encoded by nucleotides [...]0,172-90,641.

This invention provides the isolated KSHV polypeptide
encoded by nucleotides 25,661-25,741.

Further, the isolated polypeptide may be linked to a
second polypeptide to form a fusion protein by linking
the isolated nucleic acid molecule to a second nucleic
acid molecule and expression in a suitable host cell.
In one embodiment the second nucleic acid molecule
encodes beta-galactosidase. Other nucleic acid
molecules which are used to form a fusion protein [...].

Polyclonal antibodies against the polypeptide may be
produced by immunizing animals using a selected
polypeptide. Monoclonal antibodies are prepared using
hybridoma technology by fusing antibody producing B
cells from immunized animals with myeloma cells and
selecting the resulting hybridoma cell line producing
the desired antibody, as described further below.




produced. Suitable host cells include bacteria,
yeast, filamentous fungal, plant, insect and mammalian
cells. Host-vector systems for producing and
recovering a polypeptide are well known to those
skilled in the art and include, but are not limited
to, E. coli and pMAL (New England Biolabs), the Sf9
insect cell-baculovirus expression system, and
mammalian cells (such as HeLa, COS, NIH 3T3 and
[...]) transfected with a mammalian expression vector
by Lipofectin (Gibco-BRL) or calcium phosphate
precipitation or other methods to achieve vector entry
into the cell. Those of skill in the art are
knowledgeable in the numerous expression systems
available for expression of KSHV polypeptide.

This invention provides a method to select specific
regions on the polypeptide encoded by the isolated
nucleic acid molecule of the DNA virus to generate
antibodies. Amino acid sequences may be analyzed by
methods well known to those skilled in the art to
determine whether they produce hydrophobic or
hydrophilic regions in the proteins which they
build. In the case of a cell membrane polypeptide,
hydrophobic regions are well known to form the part of
the polypeptide that is inserted into the lipid
bilayer of the cell membrane, while hydrophilic
regions are located on the cell surface, in an aqueous
environment. Usually, the hydrophilic regions will be
more immunogenic than the hydrophobic regions.
Therefore the hydrophilic amino acid sequences may be
selected and used to generate antibodies specific to
polypeptide encoded by the isolated nucleic acid
molecule encoding the DNA virus. The selected
peptides may be prepared using commercially available
machines. As an alternative, nucleic acid may be
cloned and expressed and the resulting polypeptide
recovered and used as an immunogen.
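
The text leaves the hydrophobicity analysis to "methods well known to
those skilled in the art". One such method, chosen here for
illustration and not named in the specification, is a sliding-window
average over the Kyte-Doolittle hydropathy scale. The sketch below
flags windows whose mean score falls below a cutoff as candidate
hydrophilic regions. The window size of 7, the cutoff of -1.0, the
function names and the test peptide are all assumptions made for the
example.

```python
# Sketch under stated assumptions: Kyte-Doolittle hydropathy values,
# an arbitrary 7-residue window and an arbitrary cutoff of -1.0.
# Positive scores are hydrophobic, negative scores hydrophilic.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq: str, window: int = 7) -> list:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(seq: str, window: int = 7, cutoff: float = -1.0) -> list:
    """Return (start_index, peptide) for windows with mean below cutoff."""
    seq = seq.upper()
    means = window_hydropathy(seq, window)
    return [(i, seq[i:i + window]) for i, m in enumerate(means) if m < cutoff]


# Made-up peptide: a hydrophobic stretch followed by a charged stretch.
example = "LLLLLLLLKDEKRDEK"
candidates = hydrophilic_windows(example)
```

Windows returned by `hydrophilic_windows` are only candidates; the
choice of scale, window and cutoff changes which regions are flagged.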

found in the following U.S. Pat. Nos. 4,376,110 (David
et al.) or 4,098,876 (Piasio).

A. Assays for KSHV Polypeptide Antigens

One can use immunoassays to detect the virus, its
components, or antibodies thereto. A general overview
of the applicable technology is in Harlow and Lane
(1988) Antibodies, A Laboratory Manual, Cold Spring
Harbor Publication, New York.

In one embodiment, antibodies to KSHV polypeptide
antigens can be used. In brief, to produce
antibodies, the polypeptide being targeted is
expressed and purified. The product is injected into
a mammal capable of producing antibodies. Either
polyclonal or monoclonal antibodies (including
recombinant antibodies) specific for the gene product
can be used in various immunoassays. Such assays
include competitive immunoassays, radioimmunoassays,
Western blots, ELISA, indirect immunofluorescent
assays and the like. For competitive immunoassays,
see Harlow and Lane at pages 567-573 and 584-589.

Monoclonal antibodies or recombinant antibodies may be
obtained by techniques familiar to those skilled in
the art. Briefly, spleen cells or other lymphocytes
from an animal immunized with a desired antigen are
immortalized, commonly by fusion with a myeloma cell
(see, Kohler and Milstein, 1976, Eur. J. Immunol. 6,
511-519). Alternative methods of immortalization
include transformation with Epstein Barr Virus,
oncogenes, or retroviruses, or other methods well
known in the art. Colonies arising from single
immortalized cells are screened for production of
antibodies of the desired specificity and affinity for
the antigen, and yield of the monoclonal antibodies

The antibodies raised against KSHV polypeptide
antigens may be detectably labeled, using
conventional labelling techniques well known in the
art, as described above.



In addition, enzymes may be used as labels. Suitable
enzymes include alkaline phosphatase, beta-
galactosidase, glucose-6-phosphate dehydrogenase,
maleate dehydrogenase and peroxidase. Two principal
types of enzyme immunoassay are the enzyme-linked
immunosorbent assay (ELISA), and the homogeneous
enzyme immunoassay, also known as enzyme-multiplied
immunoassay (EMIT, Syva Corporation, Palo Alto, CA).
In the ELISA system, separation may be achieved, for
example, by the use of antibodies coupled to a solid
phase. The EMIT system depends on deactivation of the
enzyme in the tracer-antibody complex; activity is
thus measured without the need for a separation step.



Additionally, chemiluminescent compounds may be used
as labels. Typical chemiluminescent compounds include
luminol, isoluminol, aromatic acridinium esters,
imidazoles, acridinium salts, and oxalate esters.
Similarly, bioluminescent compounds may be utilized
for labelling, the bioluminescent compounds including
luciferin, luciferase, and aequorin.



A description of a radioimmunoassay (RIA) may be found
in: Laboratory Techniques in Biochemistry and
Molecular Biology (1978) North Holland Publishing
Company, New York, with particular reference to the
chapter entitled "An Introduction to Radioimmune Assay
and Related Techniques" by T. Chard. A description of
general immunometric assays of various types can be

The protein is then purified by a combination of cell
lysis (e.g., sonication) and affinity chromatography.
For fusion products, subsequent digestion of the
fusion protein with an appropriate proteolytic enzyme
releases the desired peptide.

The polypeptides may be purified to substantial purity
by standard techniques well known in the art,
including selective precipitation with such substances
as ammonium sulfate, column chromatography,
immunopurification methods, and others. See, for
instance, Scopes, 1982, Protein Purification:
Principles and Practice, Springer-Verlag, New York.



B. Assays for Antibodies Specifically Binding
to KSHV Polypeptides

Antibodies reactive with polypeptide antigens of KSHV
can also be measured by a variety of immunoassay
methods that are similar to the procedures described
above for measurement of antigens. For a review of
immunological and immunoassay procedures applicable to
the measurement of antibodies by immunoassay
techniques, see Basic and Clinical Immunology, 7th
Edition, Stites and Terr, Eds., and Harlow and Lane,
1988, Antibodies, A Laboratory Manual, Cold Spring
Harbor, New York.



In brief, immunoassays to measure antibodies reactive
with polypeptide antigens of KSHV can be either
competitive or noncompetitive binding assays. In
competitive binding assays, the sample analyte
competes with a labeled analyte for specific binding
sites on a capture agent bound to a solid surface.
Preferably the capture agent is a purified recombinant
human herpesvirus polypeptide produced as described
above. Other sources of human herpesvirus

produced by such cells may be enhanced by various
techniques, including injection into the peritoneal
cavity of a vertebrate host. Newer techniques using
recombinant phage antibody expression systems can also
be used to generate monoclonal antibodies. See, for
example: McCafferty et al. (1990) Nature 348, 552;
Hoogenboom et al. (1991) Nuc. Acids Res. 19, 4133; and
Marks et al. (1991) J. Mol. Biol. 222, 581-597.

Methods for characterizing naturally processed
peptides bound to MHC (major histocompatibility
complex) I molecules can be used. See Falk et al.,
1991, Nature 351, 290 and PCT publication No. WO
92/21033 published November 26, 1992. Typically,
these methods involve isolation of MHC class I
molecules by immunoprecipitation or affinity
chromatography from an appropriate cell or cell line.
Other methods involve direct amino acid sequencing of
the more abundant peptides in various HPLC fractions
by known automatic sequencing of peptides eluted from
Class I molecules of the B cell type (Jardetzky et
al., 1991, Nature 353, 326), and of the human MHC
class I molecule, HLA-A2.1 type by mass spectrometry
(Hunt et al., 1991, Eur. J. Immunol. 21, 2963-2970).
See also, Rotzschke and Falk, 1991, Immunol. Today 12,
447, for a general review of the characterization of
naturally processed peptides in MHC class I. Further,
Marloes et al., 1991, Eur. J. Immunol. 21, 2963-2970,
describe how class I binding motifs can be applied to
the identification of potential viral immunogenic
peptides in vitro.

The polypeptides described herein produced by
recombinant technology may be purified by standard
techniques well known to those of skill in the art.
Recombinantly produced viral sequences can be
directly expressed or expressed as a fusion protein.

This invention provides a replicable vector comprising
the isolated nucleic acid molecule encoding a KSHV
polypeptide. The vector includes, but is not limited
to: a plasmid, cosmid, λ phage or yeast artificial
chromosome (YAC) which contains the isolated nucleic
acid molecule.

To obtain the vector, for example, insert and vector
DNA can both be exposed to a restriction enzyme to
create complementary ends on both molecules which base
pair with each other and are then ligated together
with DNA ligase. Alternatively, linkers can be
ligated to the insert DNA which correspond to a
restriction site in the vector DNA, which is then
digested with the restriction enzyme which cuts at
that site. Other means are available and well-known
to those skilled in the art.
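
The way a shared restriction enzyme leaves complementary ends on insert and vector can be sketched in a few lines. The text names no particular enzyme, so EcoRI (recognition site GAATTC, cut after the first G) is assumed purely for illustration. The function names are hypothetical, and only the top strand of an uppercase DNA sequence is modelled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def digest_top_strand(seq, site="GAATTC", cut_offset=1):
    """Cut the top strand at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the
    upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def overhang(site="GAATTC", cut_offset=1):
    """Single-stranded overhang left by a symmetric staggered cut."""
    return site[cut_offset:len(site) - cut_offset]
```

For the assumed enzyme, `overhang()` is `AATT`, which equals its own reverse complement. This is why an insert end and a vector end produced by the same enzyme can base pair with each other before ligation.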

This invention provides a host cell containing the
vector. Suitable host cells include, but are not
limited to, bacteria (such as E. coli), yeast, fungi,
plant, insect and mammalian cells. Suitable animal
cells include, but are not limited to Vero cells, HeLa
cells, Cos cells, CV1 cells and various primary
mammalian cells.

This invention provides a transgenic nonhuman mammal
which comprises the isolated nucleic acid molecule
introduced into the mammal at an embryonic stage.
Methods of producing a transgenic nonhuman mammal are
known to those skilled in the art.

polypeptides, including isolated or partially purified
naturally occurring polypeptide, may also be used.

Noncompetitive assays are typically sandwich assays,
in which the sample analyte is bound between two
analyte-specific binding reagents. One of the binding
agents is used as a capture agent and is bound to a
solid surface. The second binding agent is labeled
and is used to measure or detect the resultant complex
by visual or instrument means. A number of
combinations of capture agent and labeled binding
agent can be used. A variety of different immunoassay
formats, separation techniques and labels can also be
used similar to those described above for the
measurement of KSHV polypeptide antigens.

Hemagglutination Inhibition (HI) and Complement
Fixation (CF) are two laboratory tests that can be
used to detect infection with human herpesvirus by
testing for the presence of antibodies against the
virus or antigens of the virus.

Serological methods can also be useful when one wishes
to detect antibody to a specific viral variant. For
example, one may wish to see how well a vaccine
recipient has responded to a new preparation by assay
of patient

In one embodiment the nucleic acid molecule from the
tumor lesion is amplified before step (b). In another
embodiment the polymerase chain reaction (PCR) is
employed to amplify the nucleic acid molecule.
Methods of amplifying nucleic acid molecules are known
to those skilled in the art.

A person of ordinary skill in the art will be able to
obtain an appropriate nucleic acid sample for diagnosing
Kaposi's sarcoma in the subject. The DNA sample
obtained by the above described method may be cleaved
by restriction enzyme before analysis, a technique
well-known in the art.

In the above described methods, a size fractionation
may be employed which is effected by a polyacrylamide
gel. In one embodiment, the size fractionation is
effected by an agarose gel. Further, transferring the
nucleic acid fragments into a solid matrix may be
employed before a hybridization step. One example of
such solid matrix is nitrocellulose paper.



This invention provides a method of detecting
expression of a KSHV gene in a cell which comprises
obtaining mRNA from the cell, contacting the mRNA
with a labeled nucleic acid molecule of KSHV under
hybridizing conditions, determining the presence of
mRNA hybridized to the molecule, thereby detecting
expression of the KSHV gene. In one embodiment cDNA
is prepared from the mRNA obtained from the cell and
used to detect KSHV expression.

Accepted means for conducting hybridization assays are
known and general overviews of the technology can be
had from a review of: Nucleic Acid Hybridization: A

III. Diagnostic Assays

This invention embraces diagnostic test kits for
detecting the presence of KSHV in biological samples,
such as skin samples or samples of other affected
tissue, comprising a container containing a nucleic
acid sequence specific for a KSHV polypeptide and
instructional material for performing the test. A
container containing nucleic acid primers to any one
of such sequences is optionally included.

This invention further embraces diagnostic test kits
for detecting the presence of KSHV in biological
samples, such as serum or solid tissue samples,
comprising a container containing antibodies to a KSHV
polypeptide, and instructional material for performing
the test. Alternatively, inactivated viral particles
or polypeptides derived from the human herpesvirus may
be used in a diagnostic test kit to detect antibodies
specific for a KSHV polypeptide.

A. Nucleic Acid Assays

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject which comprises: (a)
obtaining a nucleic acid molecule from a tumor lesion
or a suitable bodily fluid of the subject; (b)
contacting the nucleic acid molecule with a labeled
nucleic acid molecule of at least 15 nucleotides
capable of specifically hybridizing with the isolated
nucleic acid molecule of KSHV under hybridizing
conditions; and (c) determining the presence of the
nucleic acid molecule hybridized, the presence of
which is indicative of Kaposi's sarcoma in the
subject, thereby diagnosing Kaposi's sarcoma in the
subject.

Vol. 152, (1987) Academic Press, New York; or
Hybridization with Nucleic Acid Probes, pp. 495-524,
(1993) Elsevier, Amsterdam.

Usually, at least a part of the probe will have
considerable sequence identity with the target nucleic
acid. Although the extent of the sequence identity
required for specific hybridization will depend on the
length of the probe and the hybridization conditions,
the probe will usually have at least 70% identity to
the target nucleic acid, more usually at least 80%
identity, still more usually at least 90% identity and
most usually at least 95% or 100% identity.
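
The identity thresholds in the preceding paragraph (70% rising to 95% or 100%) can be made concrete with a small calculation. The sketch below assumes the probe and the target region have already been aligned without gaps and are of equal length. That is a simplification introduced here, and the function names are hypothetical.

```python
def percent_identity(probe, target_region):
    """Percent of positions at which two equal-length sequences match."""
    if not probe or len(probe) != len(target_region):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        p == t for p, t in zip(probe.upper(), target_region.upper())
    )
    return 100.0 * matches / len(probe)


def identity_tier(pct):
    """Highest threshold (100, 95, 90, 80 or 70) met by `pct`, else None."""
    for cutoff in (100, 95, 90, 80, 70):
        if pct >= cutoff:
            return cutoff
    return None
```

A 20-mer with two mismatches, for instance, has 90% identity and falls in the "still more usually" tier.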

The following stringent hybridization and washing
conditions will be adequate to distinguish a specific
probe (e.g., a fluorescently labeled nucleic acid
probe) from a probe that is not specific: incubation
of the probe with the sample for 12 hours at 37°C in
a solution containing denatured probe, 50% formamide,
2X SSC, and 0.1% (w/v) dextran sulfate, followed by
washing in 1X SSC at 70°C for 5 minutes; 1X SSC at
37°C for 5 minutes; 0.2X SSC at room temperature for
5 minutes, and H2O at room temperature for 5 minutes.
Those of skill are aware that it will often be
advantageous in nucleic acid hybridizations (i.e., in
situ, Southern, or Northern) to include detergents
(e.g., sodium dodecyl sulfate), chelating agents
(e.g., EDTA) or other reagents (e.g., buffers,
Denhardt's solution, dextran sulfate) in the
hybridization or wash solutions. To evaluate
specificity, probes can be tested on host cells
containing KSHV and compared with the results from
cells containing non-KSHV virus.

It will be apparent to those of ordinary skill in the
art that a convenient method for determining whether

Solid Supports, Meinkoth and Wahl; Analytical
Biochemistry (1984) 138, 267-284 and Innis et al.,
PCR Protocols (1990) Academic Press, San Diego.

Target-specific probes may be used in the nucleic acid
hybridization diagnostic assays for KS. The probes
are specific for or complementary to the target of
interest. For precise allelic differentiations, the
probes should be about 14 nucleotides long and
preferably about 20-30 nucleotides. For more general
detection of KSHV, nucleic acid probes are about 50 to
1000 nucleotides, most preferably about 200 to 400
nucleotides.



A specific nucleic acid probe can be RNA, DNA,
oligonucleotide, or their analogs. The probes may be
single or double stranded nucleic acid molecules. The
probes of the invention may be synthesized
enzymatically, using methods well known in the art
(e.g., nick translation, primer extension, reverse
transcription, the polymerase chain reaction, and
others) or chemically (e.g., by methods described by
Beaucage and Carruthers or Matteucci et al., supra).

The probe must be of sufficient length to be able to
form a stable duplex with its target nucleic acid in
the sample, i.e., at least about 14 nucleotides, and
may be longer (e.g., at least about 50 or 100 bases in
length). Often the probe will be more than about 100
bases in length. For example, when probe is prepared
by nick-translation of DNA in the presence of labeled
nucleotides the average probe length may be about 100-
600 bases.

For discussions of nucleic acid probe design and
annealing conditions see, for example, Ausubel et al.,
supra; Berger and Kimmel, Eds., Methods in Enzymology

An alternative means for determining the presence of
the human herpesvirus is in situ hybridization, or
more recently, in situ polymerase chain reaction. In
situ PCR is described in Nuovo et al. (1993)
Intracellular localization of PCR-amplified hepatitis
C cDNA, in American Journal of Surgical Pathology
17(7), 683-690; Bagasra et al. (1992) Detection of
HIV-1 provirus in mononuclear cells by in situ PCR, in
New England Journal of Medicine 326(21), 1385-1391;
and Heniford et al. (1993) Variation in cellular EGF
receptor mRNA expression demonstrated by in situ
reverse transcriptase polymerase chain reaction, in
Nucleic Acids Research 21, 3159-3166. In situ
hybridization assays are well known and are generally
described in Methods Enzymol. Vol. 152, (1987) Berger
and Kimmel, Eds., Academic Press, New York. In an in
situ hybridization, cells are fixed to a solid
support, typically a glass slide. The cells are then
contacted with a hybridization solution at a moderate
temperature to permit annealing of target-specific
probes that are labeled. The probes are preferably
labeled with radioisotopes or fluorescent reporters.

The above-described probes are also useful for in situ
hybridization, or in order to locate tissues which
express the gene, or for other hybridization assays
for the presence of the gene or its mRNA in various
biological tissues. In situ hybridization is a
sensitive localization method which is not dependent
on expression of polypeptide antigens or native versus
denatured conditions.

Synthetic oligonucleotide (oligo) probes and
riboprobes made from KSHV phagemids or plasmids are
also provided. Successful hybridization conditions
in tissue sections are readily transferrable from one
probe to another. Commercially-synthesized

a probe is specific for a KSHV nucleic acid molecule
utilizes a Southern blot (or Dot blot) using DNA
prepared from the virus. Briefly, to identify a
target-specific probe, DNA is isolated from the virus.
Test DNA, either viral or cellular, is transferred to
a solid (e.g., charged nylon) matrix. The probes are
labeled by conventional methods. Following
denaturation and/or prehybridization steps known in
the art, the probe is hybridized to the immobilized
DNAs under stringent conditions, such as defined
above.

It is further appreciated that in determining probe
specificity and in utilizing the method of this
invention to detect KSHV, a certain amount of
background signal is typical and can easily be
distinguished by one of skill from a specific signal.
Two-fold signal over background is acceptable.

A preferred method for detecting the KSHV polypeptide
is the use of PCR and/or dot blot hybridization.
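
The two-fold signal-over-background criterion stated above amounts to a one-line ratio test. The sketch below is illustrative only: the function name and the idea of passing numeric intensity readings (for example, densitometry counts from a blot) are assumptions made here, not part of the described method.

```python
def is_specific_signal(signal, background, min_fold=2.0):
    """True if `signal` is at least `min_fold` times `background`.

    Both values are assumed to be positive readings in the same
    arbitrary units (e.g., counts from the same blot).
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background >= min_fold
```

With a background of 100 units, a reading of 200 or more would be called specific under this criterion, while a reading of 150 would not.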

Other methods to test for the presence or absence of
KSHV for detection or prognosis, or risk assessment
for KS include Southern transfers, solution
hybridization or non-radioactive detection systems,
all of which are well known to those of skill in the
art. Hybridization is carried out using probes.
Visualization of the hybridized portions allows the
qualitative determination of the presence or absence
of the causal agent.

Similarly, a Northern transfer or reverse
transcriptase PCR may be used for the detection of
KSHV messenger RNA in a sample. These procedures are
also well known in the art. See Sambrook et al.
(1989) Molecular Cloning: A Laboratory Manual (2nd
Ed.), Cold Spring Harbor Laboratory, Vols. 1-3.

Alternative immunohistochemical protocols may be
employed which are well known to those skilled in the
art.

E . Immunologic Assays 

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject, which comprises (a)
obtaining a suitable bodily fluid sample from the
subject, (b) contacting the suitable bodily fluid of
the subject to a support having already bound thereto
an antibody recognizing the KSHV polypeptide, so as to
bind the antibody to a specific KSHV polypeptide
antigen, (c) removing unbound bodily fluid from the
support, and (d) determining the level of the antibody
bound by the antigen, thereby diagnosing Kaposi's
sarcoma.

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject, which comprises (a)
obtaining a suitable bodily fluid sample from the
subject, (b) contacting the suitable bodily fluid of
the subject to a support having already bound thereto
the KSHV polypeptide antigen, so as to bind the
antigen to a specific Kaposi's sarcoma antibody, (c)
removing unbound bodily fluid from the support, and
(d) determining the level of the antigen bound by the
Kaposi's sarcoma antibody, thereby diagnosing Kaposi's
sarcoma.

The suitable bodily fluid sample is any bodily fluid
sample which would contain Kaposi's sarcoma antibody,
antigen or fragments thereof. A suitable bodily fluid
includes, but is not limited to: serum, plasma,
cerebrospinal fluid, lymphocytes, urine, transudates,
or exudates. In the preferred embodiment, the
suitable bodily fluid sample is serum or plasma. In

oligonucleotide probes are prepared using the
nucleotide sequence of the identified gene. These
probes are chosen for length (45-65 mers), high G-C
content (50-70%) and are screened for uniqueness
against other viral sequences in GenBank.
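
The length and G-C content criteria for oligonucleotide probes can be checked mechanically. The sketch below is a hedged illustration. The 50-70% G-C window comes from the text, but the length limits are exposed as parameters because the upper bound is hard to read in the source, and the defaults of 45 and 65 are an assumption. The function names are hypothetical, and the screen for uniqueness against GenBank is not attempted here.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_probe_criteria(seq, min_len=45, max_len=65,
                          min_gc=0.50, max_gc=0.70):
    """True if `seq` meets the length and G-C content windows.

    The length check runs first, so an empty sequence is rejected
    before the G-C fraction is computed.
    """
    return (min_len <= len(seq) <= max_len
            and min_gc <= gc_fraction(seq) <= max_gc)
```

Candidates that pass this filter would still need the database comparison described above before being accepted as probes.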

Oligos are 3' end-labeled with [α-35S]dATP to specific
activities in the range of 1 x 10^[illegible] dpm/ug using
terminal deoxynucleotidyl transferase. Unincorporated
labeled nucleotides are removed from the oligo probe
by centrifugation through a Sephadex G-25 column or by
elution from a Waters Sep Pak C-18 column.

KS tissue embedded in OCT compound and snap frozen in
freezing isopentane cooled with dry ice is cut at 6 um
intervals and thawed onto 3-aminopropyltriethoxysilane
treated slides and allowed to air dry. The slides are
fixed in 4% freshly prepared paraformaldehyde and
rinsed in water. Formalin-fixed, paraffin embedded KS
tissues cut at 6 um and baked onto glass slides can
also be used. These sections are then deparaffinized
in xylenes and rehydrated through graded alcohols.
Prehybridization in 20 mM Tris pH 7.5, 0.02% Denhardt's
solution, 10% dextran sulfate for 30 min at 37°C is
followed by hybridization overnight in a solution of
50% formamide (v/v), 10% dextran sulfate (w/v), 20 mM
sodium phosphate (pH 7.4), [illegible]
solution, 100 ug/ml salmon sperm DNA, 125 ug/ml yeast
tRNA and the oligo probe (10^6 cpm/ml) at 42°C
overnight. The slides are washed twice with 3X SSC
and twice with [illegible] for 15 minutes each at room
temperature and visualized by autoradiography.
Briefly, sections are dehydrated through graded
alcohols containing 0.3M ammonium acetate, and air
dried. The slides are dipped in Kodak NTB2 emulsion,
exposed for days to weeks, developed, and
counterstained with hematoxylin and eosin.

See Immunoassays above for more details on the
immunoreagents of the invention for use in diagnostic
assays for KS.

IV. Treatment of Human Herpesvirus-Induced KS



This invention provides a method for treating a
subject with Kaposi's sarcoma (KS) comprising
administering to the subject having KS a
pharmaceutically effective amount of an antiviral
agent in a pharmaceutically acceptable carrier,
wherein the agent is effective to treat the subject
with KSHV.

Further, this invention provides a method of
prophylaxis or treatment for Kaposi's sarcoma (KS) by
administering to a patient at risk for KS, an antibody
that binds to KSHV in a pharmaceutically acceptable
carrier.

This invention provides a method of treating a subject
with Kaposi's sarcoma comprising administering to the
subject an effective amount of an antisense molecule
capable of hybridizing to the isolated DNA molecule
of KSHV under conditions such that the antisense
molecule selectively enters a KS tumor cell of the
subject, so as to treat the subject.

addition, the sample may be cells from bone marrow, or
a supernatant from a cell culture. Methods of
obtaining a suitable bodily fluid sample from a
subject are known to those skilled in the art.
Methods of determining the level of antibody or
antigen include, but are not limited to: ELISA, IFA,
and Western blotting. Other methods are known to
those skilled in the art. Further, a subject infected
with KSHV may be diagnosed as infected with the above-
described methods.



The detection of KSHV and the detection of virus-
associated KS are essentially identical processes.
The basic principle is to detect the virus using
specific ligands that bind to the virus but not to
other polypeptides or nucleic acids in a normal human
cell or its environs. The ligands can be nucleic acid
molecules, polypeptides or antibodies. The ligands
can be naturally-occurring or genetically or
physically modified, such as nucleic acids with non-
natural nucleotide bases or antibody derivatives,
i.e., Fab or chimeric antibodies. Serological tests
for detection of antibodies to the virus present in
subject sera may also be performed by using the KSHV
polypeptide as an antigen, as described herein.

Samples can be taken from patients with KS or from
patients at risk for KS, such as AIDS patients.
Typically the samples are taken from blood (cells,
serum and/or plasma) or from solid tissue samples such
as skin lesions. The most accurate diagnosis for KS
will occur if elevated titers of the virus are
detected in the blood or in involved lesions. KS may
also be indicated if antibodies to the virus are
detected and if other diagnostic factors for KS are
present.

promoting inhibitory mechanisms of the cells, such as
promoting RNA degradation. Inhibitory nucleic acid
methods therefore encompass a number of different
approaches to altering expression of herpesvirus
genes. These different types of inhibitory nucleic
acid technology are described in Helene and Toulme
(1990) Biochim. Biophys. Acta. 1049, 99-125, which is
referred to hereinafter as "Helene and Toulme."

In brief, inhibitory nucleic acid therapy approaches
can be classified into those that target DNA
sequences, those that target RNA sequences (including
pre-mRNA and mRNA), those that target proteins (sense
strand approaches), and those that cause cleavage or
chemical modification of the target nucleic acids.



Approaches targeting DNA fall into several categories.
Nucleic acids can be designed to bind to the major
groove of the duplex DNA to form a triple helical or
"triplex" structure. Alternatively, inhibitory
nucleic acids are designed to bind to regions of
single stranded DNA resulting from the opening of the
duplex DNA during replication or transcription.

More commonly, inhibitory nucleic acids are designed
to bind to mRNA or mRNA precursors. Inhibitory
nucleic acids are used to prevent maturation of pre-
mRNA. Inhibitory nucleic acids may be designed to
interfere with RNA processing, splicing or
translation.

The inhibitory nucleic acids can be targeted to mRNA.
In this approach, the inhibitory nucleic acids are
designed to specifically block translation of the
encoded protein. Using this approach, the inhibitory
nucleic acid can be used to selectively suppress

This invention provides an antisense molecule capable
of hybridizing to the isolated nucleic acid molecule
of KSHV. In one embodiment the antisense molecule is
DNA. In another embodiment the antisense molecule is
RNA. In another embodiment, the antisense molecule is
a nucleic acid derivative (e.g., DNA or RNA with a
protein backbone).

The present invention extends to the preparation of
antisense nucleic acids and ribozymes that may be used
to interfere with the expression of a polypeptide
either by masking the mRNA with an antisense nucleic
acid or cleaving it with a ribozyme, respectively.

This invention provides inhibitory nucleic acid
therapeutics which can inhibit the activity of
herpesviruses in patients with KS by binding to the
isolated nucleic acid molecule of KSHV. Inhibitory
nucleic acids may be single-stranded nucleic acids,
which can specifically bind to a complementary nucleic
acid sequence. By binding to the appropriate target
sequence, an RNA-RNA, a DNA-DNA, or RNA-DNA duplex or
triplex is formed. These nucleic acids are often
termed "antisense" because they are usually
complementary to the sense or coding strand of the
gene, although recently approaches for use of "sense"
nucleic acids have also been developed. The term
"inhibitory nucleic acids" as used herein, refers to
both "sense" and "antisense" nucleic acids.
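
Because an antisense nucleic acid is complementary to the sense strand, its sequence can be derived by taking the reverse complement of the target. The sketch below assumes a DNA antisense oligonucleotide written 5' to 3' and accepts either an RNA or a DNA sense sequence. The function name is hypothetical, and no particular KSHV target is implied.

```python
# Map each sense base to its DNA complement; U (RNA) pairs with A.
_SENSE_TO_ANTISENSE = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(sense_sequence):
    """Return the DNA antisense strand (5'->3') for a sense RNA or DNA."""
    return sense_sequence.translate(_SENSE_TO_ANTISENSE)[::-1]
```

For the sense fragment `AUGGCC`, the function returns `GGCCAT`, which pairs base for base with the target when the two strands are aligned antiparallel.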

By binding to the target nucleic acid, the inhibitory
nucleic acid can inhibit the function of the target
nucleic acid. This could, for example, be a result of
blocking DNA transcription, processing or poly(A)
addition to mRNA, DNA replication, translation, or

The targeting of inhibitory nucleic acids to specific
cells of the immune system by conjugation with
targeting moieties binding receptors on the surface of
these cells can be used for all of the above forms of
inhibitory nucleic acid therapy. This invention
encompasses all of the forms of inhibitory nucleic
acid therapy as described above and as described in
Helene and Toulme.

An example of an antiherpes virus inhibitory nucleic
acid is ISIS 2922 (ISIS Pharmaceuticals) which has
activity against CMV (see Biotechnology News 14:5).

A problem associated with inhibitory nucleic acid
therapy is the effective delivery of the inhibitory
nucleic acid to the target cell in vivo and the
subsequent internalization of the inhibitory nucleic
acid by that cell. This can be accomplished by
linking the inhibitory nucleic acid to a targeting
moiety to form a conjugate that binds to a specific
receptor on the surface of the target infected cell,
and which is internalized after binding.



The use of combinations of antiviral drugs and
sequential treatments is useful for treatment of
herpesvirus infections and will also be useful for the
treatment of herpesvirus-induced KS. For example,
Snoeck et al. (1992) Eur. J. Clin. Microbiol. Infect.
Dis. 11, 1144-1155, found additive or synergistic
effects against CMV when combining antiherpes drugs
(e.g., combinations of zidovudine [3'-azido-3'-
deoxythymidine, AZT] with foscarnet or acyclovir or
of HPMPC with other antivirals). Similarly, in
treatment of cytomegalovirus retinitis, induction with
ganciclovir

translation of mRNA encoding critical proteins. For
example, an inhibitory nucleic acid complementary to
regions of c-myc mRNA inhibits c-myc protein
expression in a human promyelocytic leukemia cell
line, HL60, which overexpresses the c-myc proto-
oncogene. See Wickstrom et al. (1988) PNAS 85, 1028-
1032 and Harel-Bellan et al. (1988) Exp. Med. 168,
2309-2318. As described in Helene and Toulme,
inhibitory nucleic acids targeting mRNA have been
shown to work by several different mechanisms to
inhibit translation of the encoded protein(s).

The inhibitory nucleic acids introduced into the cell
can also encompass the "sense" strand of the gene or
mRNA to trap or compete for the enzymes or binding
proteins involved in mRNA translation, as described in
Helene and Toulme.

Lastly, the inhibitory nucleic acids can be used to
induce chemical inactivation or cleavage of the target
genes or mRNA. Chemical inactivation can occur by the
induction of crosslinks between the inhibitory nucleic
acid and the target nucleic acid within the cell.
Other chemical modifications of the target nucleic
acids induced by appropriately derivatized inhibitory
nucleic acids may also be used.

Cleavage, and therefore inactivation, of the target
nucleic acids may be effected by attaching a
substituent to the inhibitory nucleic acid which can
be activated to induce cleavage reactions. The
substituent can be one that effects either chemical
or enzymatic cleavage. Alternatively, cleavage can
be induced by the use of ribozymes or catalytic RNA.
In this approach, the inhibitory nucleic acids would
comprise either naturally occurring RNA (ribozymes) or
synthetic nucleic acids with catalytic activity.

Infectious Disease Ch. 35, 269, W.B. Saunders,
Philadelphia, Pennsylvania) and the like.
Immunological therapy will also be effective in many
cases to manage and alleviate symptoms caused by the
disease agents described here. Antiviral agents
include agents or compositions that directly bind to
viral products and interfere with disease progress,
and exclude agents that do not impact directly on
viral multiplication or viral titer. Antiviral agents
do not include immunoregulatory agents that do not
directly affect viral titer or bind to viral products.
Antiviral agents are effective if they inactivate the
virus, otherwise inhibit its infectivity or
multiplication, or alleviate the symptoms of KS.

The antiherpesvirus agents that will be useful for
treating virus-induced KS can be grouped into broad
classes based on their presumed modes of action.
These classes include agents that act (1) by
inhibition of viral DNA polymerase, (2) by targeting
other viral enzymes and proteins, (3) by miscellaneous
or incompletely understood mechanisms, or (4) by
binding a target nucleic acid (i.e., inhibitory
nucleic acid therapeutics, supra). Antiviral agents
may also be used in combination (i.e., together or
sequentially) to achieve synergistic or additive
effects or other benefits.

Although it is convenient to group antiviral agents by
their supposed mechanism of action, the applicants do
not intend to be bound by any particular mechanism of
antiviral action. Moreover, it will be understood by
those of skill that an agent may act on more than one
target in a virus or virus-infected cell or through
more than one mechanism.



i) Inhibitors of DNA Polymerase






followed by maintenance with foscarnet has been
suggested as a way to maximize efficacy while
minimizing the adverse side effects of either
treatment alone. An anti-herpetic composition that
contains acyclovir and, e.g., 2-acetylpyridine-5-((2-
pyridylamino)thiocarbonyl)-thiocarbonohydrazone is
described in U.S. Pat. 5,175,165 (assigned to
Burroughs Wellcome Co.). Combinations of TS-
inhibitors and viral TK-inhibitors in antiherpetic
medicines are disclosed in a U.S. patent
assigned to Stichting Rega VZW. A synergistic
inhibitory effect on EBV replication using certain
ratios of combinations of HPMPC with AZT was reported
by Lin et al. (1991) Antimicrob Agents Chemother
35:2440-3.

U.S. Patent Nos. 5,164,395 and 5,021,427 (Blumenkopf;
Burroughs Wellcome) describe the use of a
ribonucleotide reductase inhibitor (an acetylpyridine
derivative) for treatment of herpes infections,
including the use of the acetylpyridine derivative in
combination with acyclovir. U.S. Patent No. 5,137,724
(Balzarini et al. (1990) Mol. Pharm. 37, 402-7) describes
the use of thymidylate synthase inhibitors (e.g.,
5-fluoro-uracil and 5-fluoro-2'-deoxyuridine) in
combination with compounds having viral thymidine
kinase inhibiting activity.



With the discovery of a disease causal agent for KS
now identified, effective therapeutic or prophylactic
protocols to alleviate or prevent the symptoms of
herpes virus-associated KS can be formulated. Due to
the viral nature of the disease, antiviral agents have
application here for treatment, such as interferons,
nucleoside analogues, ribavirin, amantadine, and
pyrophosphate analogues of phosphonoacetic acid
(foscarnet) (reviewed in Gorbach et al., 1992,

antiviral drug. The mechanism of action of certain
anti-herpesvirus agents is discussed in De Clercq
(1993, Antimicrobial Chemotherapy 32, Suppl. A,
121-132) and in other references cited supra and infra.

Anti-herpesvirus medications suitable for treating
viral induced KS include, but are not limited to,
nucleoside analogs including acyclic nucleoside
phosphonate analogs (e.g., phosphonyl-
methoxyalkylpurines and -pyrimidines), and cyclic
nucleoside analogs. These include drugs such as:
vidarabine (9-β-D-arabinofuranosyladenine; adenine
arabinoside, ara-A, Vira-A, Parke-Davis); 1-β-D-
arabinofuranosyluracil (ara-U); 1-β-D-
arabinofuranosyl-cytosine (ara-C); HPMPC [(S)-1-[3-
hydroxy-2-(phosphonylmethoxy)propyl]cytosine (e.g., GS
504, Gilead Science)] and its cyclic form (cHPMPC);
HPMPA [(S)-9-(3-hydroxy-2-phosphonylmethoxypropyl)
adenine] and its cyclic form (cHPMPA); (S)-HPMPDAP
[(S)-9-(3-hydroxy-2-phosphonylmethoxypropyl)-2,6-
diaminopurine]; PMEDAP [9-(2-phosphonyl-methoxyethyl)-
2,6-diaminopurine]; HOE 602 [2-amino-9-(1,3-
bis(isopropoxy)-2-propoxymethyl)purine]; PMEA [9-(2-
phosphonylmethoxyethyl)adenine]; bromovinyl-
deoxyuridine (Burns and Sandford, 1990, J. Infect.
Dis. 162:634-7); 1-β-D-arabinofuranosyl-E-5-(2-
bromovinyl)-uridine or -2'-deoxyuridine; BVaraU
(1-β-D-arabinofuranosyl-E-5-(2-bromovinyl)-uracil,
brovavir, Bristol-Myers Squibb, Yamasa Shoyu); BVDU
[(E)-5-(2-bromovinyl)-2'-deoxyuridine, brivudin, e.g.,
Helpin] and its carbocyclic analogue (in which the
sugar moiety is replaced by a cyclopentane ring); IVDU
[(E)-5-(2-iodovinyl)-2'-deoxyuridine] and its
carbocyclic analogue, C-IVDU (Balzarini et al.,
supra); and 5-mercurithio analogs of 2'-deoxyuridine
(Holliday and Williams, 1992, Antimicrob. Agents
Chemother. 36, 1935); acyclovir [9-([2-

Many antiherpesvirus agents in clinical
development today are nucleoside analogs believed to
act through inhibition of viral DNA replication,
especially through inhibition of viral DNA polymerase.
These nucleoside analogs act as alternative substrates
for the viral DNA polymerase or as competitive
inhibitors of DNA polymerase substrates. Usually
these agents are preferentially phosphorylated by
viral thymidine kinase (TK), if one is present, and/or
have higher affinity for viral DNA polymerase than for
the cellular DNA polymerases, resulting in selective
antiviral activity. Where a nucleoside analogue is
incorporated into the viral DNA, viral activity or
reproduction may be affected in a variety of ways.
For example, the analogue may act as a chain
terminator, cause increased lability (e.g.,
susceptibility to breakage) of analogue-containing
DNA, and/or impair the ability of the substituted DNA
to act as template for transcription or replication
(see, e.g., Balzarini et al., supra).

It will be known to one of skill that, as with many
drugs, many of the agents useful for treatment of
herpes virus infections are modified (i.e.,
"activated") by the host, host cell, or virus-infected
host cell metabolic enzymes. For example, acyclovir
is triphosphorylated to its active form, with the
first phosphorylation being carried out by the herpes
virus thymidine kinase, when present. Other examples
are the reported conversion of the compound HOE 602 to
ganciclovir in a three-step metabolic pathway (Winkler
et al., Antiviral Research 14) and the
phosphorylation of ganciclovir to its active form by,
e.g., a CMV nucleotide kinase. It will be apparent to
one of skill that the specific metabolic capabilities
of a virus can affect the sensitivity of that virus to
specific drugs, and is one factor in the choice of an

Human Retroviruses 9, 307-314) and are additional
nucleoside analogs that may be used to treat KS. An
exemplary protocol for these agents is an intravenous
injection of about 0.35 mg/meter² (0.7 mg/kg)
weekly or every other week for at least [...],
preferably up to about four to eight weeks.



Acyclovir and ganciclovir are of interest because of
their accepted use in clinical settings. Acyclovir,
an acyclic analogue of guanine, is phosphorylated by
a herpesvirus thymidine kinase and undergoes further
phosphorylation to be incorporated as a chain
terminator by the viral DNA polymerase during viral
replication. It has therapeutic activity against a
broad range of herpesviruses, Herpes simplex Types 1
and 2, Varicella-Zoster, Cytomegalovirus, and
Epstein-Barr Virus, and is used to treat disease such
as herpes encephalitis, neonatal herpesvirus
infections, chickenpox in immunocompromised hosts,
herpes zoster recurrences, CMV retinitis, EBV
infections, chronic fatigue syndrome, and hairy
leukoplakia in AIDS patients. Exemplary intravenous
dosages or oral dosages are 250 mg/kg/m² body surface
area, every 8 hours for 7 days, or maintenance doses
of 200-400 mg IV or orally twice a day to suppress
recurrence. Ganciclovir has been shown to be more
active than acyclovir against some herpesviruses. See,
e.g., Oren and Soble, 1991, Clinical Infectious
Diseases 14, 741-6. Treatment protocols for
ganciclovir are 5 mg/kg twice a day IV or 2.5 mg/kg
three times a day for 10-14 days. Maintenance doses
are 5-6 mg/kg for 5-7 days.



Also of interest is HPMPC. HPMPC is reported to be
more active than either acyclovir or ganciclovir in
the chemotherapy and prophylaxis of various HSV-1,

hydroxyethoxy]methyl)guanine; e.g., Zovirax (Burroughs
Wellcome)]; penciclovir [9-[4-hydroxy-3-
(hydroxymethyl)butyl]-guanine]; ganciclovir [9-[1,3-
dihydroxy-2-propoxymethyl]-guanine; e.g.,
Cytovene (Syntex)], DHPG (Stals et al., 1993,
Antimicrobial Agents Chemother. 37, 218-23);
isopropylether derivatives of ganciclovir (see, e.g.,
Winkelmann et al., 1988, Drug Res. 38, 1545-1548);
cygalovir; famciclovir [2-amino-9-(4-acetoxy-3-
(acetoxymethyl)but-1-yl)purine];
valacyclovir (Burroughs Wellcome); desciclovir [(2-
amino-9-(2-ethoxymethyl)purine)] and 2-amino-9-(2-
hydroxyethoxymethyl)-9H-purine (prodrugs of
acyclovir); CDG (carbocyclic 2'-deoxyguanosine); and
purine nucleosides with the pentafuranosyl ring
replaced by a cyclobutane ring (e.g., cyclobut-A
[(±)-9-[(1β,2α,3β)-2,3-bis(hydroxymethyl)-1-
cyclobutyl]adenine], cyclobut-G [(±)-9-[(1β,2α,3β)-
2,3-bis(hydroxymethyl)-1-cyclobutyl]guanine], BHCG
[(R)-(1α,2β,3α)-9-[2,3-
bis(hydroxymethyl)cyclobutyl]guanine], and an active
isomer of racemic BHCG [2-amino-9-[2,3-
bis(hydroxymethyl)cyclobutyl]-6H-purin-6-one]; see,
Braitman et al., 1991, Antimicrob. Agents
and Chemotherapy 35, 1464-1468). Certain of these
antiherpesviral agents are discussed in Gorbach et al.,
1992, Infectious Disease Ch. 35, 259, W.B. Saunders,
Philadelphia; Saunders et al., J. Acquir. Immune
Defic. Syndr. 3, 571; Hamanaka et al.,
Pharmacol. 40, 446; and Greenspan et al., 1991, J.
Acquir. Immune Defic. Syndr. 3.

Triciribine and triciribine monophosphate are potent
inhibitors against herpes viruses (Ickes et al.,
1994, Antiviral Research 23, Seventh Int.
Conf. on Antiviral Research, Abstract No. 12[...], Supp.
1), HIV-1 and HIV-2
(Kucera et al., 1993,

HSV-2, TK- HSV, VZV or CMV infections in animal models
(De Clercq, supra).

Nucleoside analogs such as BVaraU are potent
inhibitors of HSV-1, EBV, and VZV that have
greater activity than acyclovir in animal models of
encephalitis. FIAC (fluoroiodoarabinosyl cytosine) and
its related fluoroethyl and iodo compounds (e.g., FEAU,
FIAU) have potent selective activity against
herpesviruses, and HPMPA ((S)-1-([3-hydroxy-2-
phosphorylmethoxy]propyl)adenine) has been
demonstrated to be more potent against HSV and CMV
than acyclovir or ganciclovir and is of choice in
advanced cases. Cladribine (2-
chlorodeoxyadenosine) is another nucleoside analogue
known as a highly specific
immunosuppressive drug.

Other useful antiviral agents include: 5-thien-2-yl-
2'-deoxyuridine derivatives, e.g., BTDU [5-(5-
bromothien-2-yl)-2'-deoxyuridine] and CTDU
[5-(5-chlorothien-2-yl)-2'-deoxyuridine]; and OXT-A
[9-(2-deoxy-2-hydroxymethyl-β-D-erythro-
oxetanosyl)adenine]
and OXT-G [9-(2-deoxy-2-hydroxymethyl-β-D-erythro-
oxetanosyl)guanine]. Although OXT-G is believed to act
by inhibiting viral DNA synthesis its mechanism of
action has not yet been elucidated. These and other
compounds are described in Andrei et al., 1992, Eur.
J. Clin. Microbiol. Infect. Dis. 11, 143-51.
Additional antiviral purine derivatives useful in
treating herpesvirus infections are disclosed in U.S.
Pat. 5,108,994 (assigned to Beecham Group P.L.C.).
6-Methoxypurine arabinoside (ara-M; Burroughs Wellcome)
is a potent inhibitor of varicella-zoster virus, and
will be useful for treatment of KS.

ii) Other Antivirals

Although applicants do not intend to be bound by a
particular mechanism of antiviral action, the
antiherpes-virus agents described above are believed
to act through inhibition of viral DNA polymerase.
However, viral replication requires not only the
replication of the viral nucleic acid but also the
production of viral proteins and other essential
components. Accordingly, the present invention
contemplates treatment of KS by the inhibition of
viral proliferation by targeting viral proteins other
than DNA polymerase (e.g., by inhibition of their
synthesis or activity, or destruction of viral
proteins after their synthesis). For example,
administration of agents that inhibit a viral serine
protease, e.g., such as one important in development
of the viral capsid, will be useful in treatment of
viral induced KS.

Other viral enzyme targets include: OMP decarboxylase
inhibitors (a target of, e.g., pyrazofurin), CTP
synthetase inhibitors (targets of, e.g.,
cyclopentenylcytosine), IMP dehydrogenase,
ribonucleotide reductase (a target of, e.g., carboxyl-
containing N-alkyldipeptides as described in U.S.
Patent No. 5,110,799 (Tolman et al., Merck)),
thymidine kinase (a target of, e.g., 1-[2-
(hydroxymethyl)cycloalkylmethyl]-5-substituted
-uracils and -guanines as described in, e.g., U.S.
Patent Nos. 4,863,927 and 4,782,062 (Tolman et al.,
Merck)) as well as other enzymes. It will be apparent
to one of ordinary skill in the art that there are
additional viral proteins, both characterized and as
yet to be discovered, that can serve as target for
antiviral agents.

Kutapressin is a liver derivative available from
Schwarz Pharma of Milwaukee, Wisconsin, in an
injectable form of 25 mg/ml. The recommended dosage
for herpesviruses is from 200 to 25 mg/ml per day for
an average adult of 150 pounds.

Poly(I)·Poly(C12U), an accepted antiviral drug known as
Ampligen from HEM Pharmaceuticals of Rockville, MD has
been shown to inhibit herpesviruses and is another
antiviral agent suitable for treating KS. Intravenous
injection is the preferred route of administration.
Dosages from about 100 to 600 mg/m² are administered
two to three times weekly to adults averaging 150
pounds. It is best to administer at least 200 mg/m²
per week.
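
The Ampligen schedule stated above reduces to simple arithmetic. The following is a minimal sketch only, assuming a fixed dose per administration; the function names are hypothetical, and reading "two to three times weekly" as exactly 2 or 3 administrations is an assumption made for illustration rather than part of the disclosure.

```python
def weekly_total(dose_mg_per_m2: float, doses_per_week: int) -> float:
    """Total weekly amount in mg/m^2 for a fixed per-administration dose."""
    return dose_mg_per_m2 * doses_per_week


def fits_stated_schedule(dose_mg_per_m2: float, doses_per_week: int) -> bool:
    """Check a schedule against the ranges quoted in the text.

    Assumptions (illustrative only): "two to three times weekly" is read
    as exactly 2 or 3 administrations, and the 200 mg/m^2 weekly minimum
    is applied to the sum of all administrations in one week.
    """
    dose_ok = 100.0 <= dose_mg_per_m2 <= 600.0
    frequency_ok = doses_per_week in (2, 3)
    weekly_ok = weekly_total(dose_mg_per_m2, doses_per_week) >= 200.0
    return dose_ok and frequency_ok and weekly_ok


if __name__ == "__main__":
    for dose, n in [(100.0, 2), (300.0, 3), (50.0, 3), (100.0, 1)]:
        print(dose, n, weekly_total(dose, n), fits_stated_schedule(dose, n))
```

Under these assumptions, the lowest stated dose given twice weekly totals exactly the 200 mg/m² weekly minimum, so the stated ranges are mutually consistent.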

Other antiviral agents reported to show activity
against herpes viruses (e.g., varicella zoster and
herpes simplex) and that will be useful for the
treatment of herpesvirus-induced KS include mappicine
ketone (SmithKline Beecham); Compounds A,79296 and
A,72209 (Abbott) for varicella zoster, and Compound
882C87 (Burroughs Wellcome) (see, The Pink Sheet
55(20) May 17, 1993).

Interferon is known to inhibit replication of herpes
viruses. See Oren and Soble, supra. Interferon has
known toxicity problems and it is expected that second
generation derivatives will soon be available that
will retain interferon's antiviral properties but have
reduced side effects.

It is also contemplated that herpes virus-induced KS
may be treated by administering a herpesvirus
reactivating agent to induce reactivation of the
latent virus. Preferably the reactivation is combined

As used herein administration means a method of
administering to a subject. Such methods are well
known to those skilled in the art and include, but are
not limited to, administration topically,
parenterally, orally, intravenously, intramuscularly,
subcutaneously or by aerosol. Administration of the
agent may be effected continuously or intermittently
such that the therapeutic agent in the patient is
effective to treat a subject with Kaposi's sarcoma or
a subject infected with a DNA virus associated with
Kaposi's sarcoma.

The antiviral compositions for treating herpesvirus-
induced KS are preferably administered to human
patients via oral, intravenous or parenteral
administrations and other systemic forms. Those of
skill in the art will understand appropriate
administration protocol for the individual
compositions to be employed by the physician.

The pharmaceutical formulations or compositions of
this invention may be in the dosage form of solid,
semi-solid, or liquid such as, e.g., suspensions,
aerosols or the like. Preferably the compositions are
administered in unit dosage forms suitable for single
administration of precise dosage amounts. The
compositions may also include, depending on the
formulation desired, pharmaceutically-acceptable, non-
toxic carriers or diluents, which are defined as
vehicles commonly used to formulate pharmaceutical
compositions for animal or human administration. The
diluent is selected so as not to affect the biological
activity of the combination. Examples of such
diluents are distilled water, physiological saline,
Ringer's solution, dextrose solution, and Hank's
solution. In addition, the pharmaceutical composition
or formulation may also include other carriers,

with simultaneous or sequential administration of an
anti-herpesvirus agent. Controlled reactivation over
a short period of time or reactivation in the presence
of an antiviral agent is believed to minimize the
adverse effects of certain herpesvirus infections
(e.g., as discussed in PCT Application WO 93/04683).
Reactivating agents include agents such as estrogen,
phorbol esters, forskolin and β-adrenergic blocking
agents.

Agents useful for treatment of herpesvirus infections
and for treatment of herpesvirus-induced KS are
described in numerous U.S. Patents. For example,
ganciclovir is an example of an antiviral guanine
acyclic nucleotide of the type described in US Patent
Nos. 4,355,032 and 4,603,219.

Acyclovir is an example of a class of antiviral purine
derivatives, including 9-(2-
hydroxyethylmethyl)adenine, of the type described in
U.S. Pat. Nos. 4,287,188, 4,294,831 and 4,199,574.

Brivudin is an example of an antiviral deoxyuridine
derivative of the type described in US Patent No.
4,424,211.

Vidarabine is an example of an antiviral purine
nucleoside of the type described in British Pat.
1,159,290.

Brovavir is an example of an antiviral deoxyuridine
derivative of the type described in US Patent Nos.
4,542,210 and 4,386,076.

BHCG is an example of an antiviral carbocyclic
nucleoside analogue of the type described in US Patent
Nos. 5,152,352, 5,034,394 and 5,126,545.

HPMPC is an example of an antiviral phosphonyl-
methoxyalkyl derivative of the type described in
US Patent No. 5,142,051.

CDG (Carbocyclic 2'-deoxyguanosine) is an example of
an antiviral carbocyclic nucleoside analogue of the
type described in US Patent Nos. 4,543,255, 4,855,466,
and 4,894,456.

Foscarnet is described in US Patent No. 4,553,445.

Trifluridine and its corresponding ribonucleoside are
described in US Patent No. 3,201,387.

U.S. Patent No. 5,321,030 (Kaddurah-Daouk et al.;
Amira) describes the use of creatine analogs as
antiherpes viral agents. U.S. Patent No. 5,306,722
(Kim et al.; Bristol-Myers Squibb) describes
thymidine kinase inhibitors useful for treating HSV
infections and for inhibiting herpes thymidine kinase.
Other antiherpesvirus compositions are described in
U.S. Patent Nos. 5,286,649 and 5,098,708 (Konishi et
al., Bristol-Myers Squibb) and 5,175,165 (Blumenkopf
et al.; Burroughs Wellcome). U.S. Patent No.
4,880,820 (Ashton et al., Merck) describes the
antiherpes virus agent (S)-9-(2,3-dihydroxy-1-
propoxymethyl)guanine.

U.S. Patent No. 4,703,925 (Suhadolnik et al., Research
Corporation) describes a 3'-deoxyadenosine compound
effective in inhibiting HSV and EBV. U.S. Patent No.
4,386,076 (Machida et al., Yamasa Shoyu Kabushiki
Kaisha) describes use of
(E)-5-(2-halogenovinyl)-arabinofuranosyluracil as an
antiherpesvirus agent. U.S. Patent No. 4,340,599
(Lieb et al., Bayer Aktiengesellschaft) describes
phosphonohydroxyacetic acid derivatives useful as

adjuvants; or nontoxic, nontherapeutic, nonimmunogenic
stabilizers and the like. Effective amounts of such
diluent or carrier are those amounts which are
effective to obtain a pharmaceutically acceptable
formulation in terms of solubility of components, or
biological activity, etc.

V. Immunological Approaches to Therapy

Having identified a primary causal agent of KS in
humans as a novel human herpesvirus, there are
immunosuppressive therapies that can modulate the
immunologic dysfunction that arises from the presence
of viral-infected tissue. In particular, agents that
block the immunological attack of the viral-infected
cells will ameliorate the symptoms of KS and/or reduce
disease progression. Such therapies include
antibodies that prevent immune system targeting of
viral-infected cells. Such agents include antibodies
which bind to cytokines that otherwise upregulate the
immune system in response to viral infection.

The antibody may be administered to a patient either
singly or in a cocktail containing two or more
antibodies, other therapeutic agents, compositions, or
the like, including, but not limited to, immuno-
suppressive agents, potentiators and side-effect
relieving agents. Of particular interest are immuno-
suppressive agents useful in suppressing allergic
reactions of a host. Immunosuppressive agents of
interest include prednisone, prednisolone, DECADRON
(Merck, Sharp & Dohme, West Point, PA),
cyclophosphamide, cyclosporin, 6-mercaptopurine,
methotrexate, azathioprine and i.v. gamma globulin or
their combination. Potentiators of interest include
monensin, ammonium chloride and chloroquine. All of
these agents are administered in generally accepted

efficacious dose ranges such as those disclosed in the
Physician Desk Reference, 41st Ed. (1987), Publisher
Edward R. Barnhart, New Jersey.

Immune globulin from persons previously infected with
human herpesviruses or related viruses can be obtained
using standard techniques. Appropriate titers of
antibodies are known for this therapy and are readily
applied to the treatment of KS. Immune globulin can
be administered via parenteral injection or by
intrathecal shunt. In brief, immune globulin
preparations may be obtained from individual donors
who are screened for antibodies to the KS-associated
human herpesvirus, and plasmas from high-titered
donors are pooled. Alternatively, plasmas from donors
are pooled and then tested for antibodies to the human
herpesvirus of the invention; high-titered pools are
then selected for use in KS patients.
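
The two sourcing strategies described above (screen donors, then pool; or pool, then test) can be contrasted in a short sketch. This is illustrative only: the titer cutoff, the donor and pool identifiers, and the approximation of a pool's titer as the mean of its donors' titers are all assumptions introduced here, not values or methods taken from the disclosure.

```python
from statistics import mean


def screen_then_pool(donor_titers: dict[str, float], cutoff: float) -> list[str]:
    """Strategy 1: screen each donor, keep only high-titered donors for pooling."""
    return sorted(d for d, titer in donor_titers.items() if titer >= cutoff)


def pool_then_test(pools: dict[str, list[float]], cutoff: float) -> list[str]:
    """Strategy 2: pool plasmas first, then keep only high-titered pools.

    Assumption: a pool's titer is approximated by the mean donor titer.
    """
    return sorted(
        name for name, titers in pools.items()
        if titers and mean(titers) >= cutoff
    )


if __name__ == "__main__":
    cutoff = 320.0  # hypothetical reciprocal titer threshold
    donors = {"d1": 640.0, "d2": 40.0, "d3": 1280.0}
    pools = {"A": [640.0, 1280.0], "B": [40.0, 80.0]}
    print(screen_then_pool(donors, cutoff))
    print(pool_then_test(pools, cutoff))
```

The first strategy excludes low-titered donors before pooling, while the second accepts or rejects whole pools, so a pool that passes may still contain low-titered plasma.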

Antibodies may be formulated into an injectable
preparation. Parenteral formulations are known and
are suitable for use in the invention, preferably for
i.m. or i.v. administration. The formulations
containing therapeutically effective amounts of
antibodies or immunotoxins are either sterile liquid
solutions, liquid suspensions or lyophilized versions
and optionally contain stabilizers or excipients.
Lyophilized compositions are reconstituted with
suitable diluents, e.g., water for injection, saline,
0.3% glycine and the like, at a level of about from
.01 mg/kg of host body weight to 10 mg/kg where
appropriate. Typically, the pharmaceutical
compositions containing the antibodies or immunotoxins
will be administered in a therapeutically effective
dose in a range of from about .01 mg/kg to about 5
mg/kg of the treated mammal. A preferred
therapeutically effective dose of the pharmaceutical

in a range of from about 0.01 mg/kg to about 0.5 mg/kg
body weight of the treated mammal administered over
several days to two weeks by daily intravenous
infusion, each given over a one hour period, in a
sequential patient dose-escalation regimen.

Antibody may be administered systemically by injection
i.m., subcutaneously or intraperitoneally or directly
into KS lesions. The dose will be dependent upon the
properties of the antibody or immunotoxin employed,
e.g., its activity and biological half-life, the
concentration of antibody in the formulation, the site
and rate of dosage, the clinical tolerance of the
patient involved, the disease afflicting the patient
and the like as is well within the skill of the
physician.



The antibody of the present invention may be
administered in solution. The pH of the solution
should be in the range of pH 5 to 9.5, preferably pH
6.5 to 7.5. The antibody or derivatives thereof
should be in a solution having a suitable
pharmaceutically acceptable buffer such as phosphate,
tris(hydroxymethyl)aminomethane-HCl or citrate and
the like. Buffer concentrations should be in the
range of 1 to 100 mM. The solution of antibody may
also contain a salt, such as sodium chloride or
potassium chloride in a concentration of 50 to 150 mM.
An effective amount of a stabilizing agent such as an
albumin, a globulin, a gelatin, a protamine or a salt
of protamine may also be included and may be added to
a solution containing antibody or immunotoxin or to
the composition from which the solution is prepared.

Systemic administration of antibody is made daily,
generally by intramuscular injection, although
intravascular infusion is acceptable. Administration
may also be intranasal or by other nonparenteral
routes. Antibody or immunotoxin may also be
administered via microspheres, liposomes or other
microparticulate delivery systems placed in certain
tissues including blood.

In therapeutic applications, the dosages of compounds
used in accordance with the invention vary depending
on the class of compound and the condition being
treated. The age, weight, and clinical condition of
the recipient patient; and the experience and judgment
of the clinician or practitioner administering the
therapy are among the factors affecting the selected
dosage. For example, the dosage of an immunoglobulin
can range from about 0.1 milligram per kilogram of
body weight per day to about 10 mg/kg per day for
polyclonal antibodies and about 5% to about 10% of
that amount for monoclonal antibodies. In such a
case, the immunoglobulin can be administered once
daily as an intravenous infusion. Preferably, the
dosage is repeated daily until either a therapeutic
result is achieved or until side effects warrant
discontinuation of therapy. Generally, the dose
should be sufficient to treat or ameliorate symptoms
or signs of KS without producing unacceptable toxicity
to the patient.
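
The immunoglobulin dosage ranges stated above can be worked through as weight-based arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the phrase "about 5% to about 10% of that amount" is read here, as an assumption, to mean 5% of the lower polyclonal bound up to 10% of the upper polyclonal bound.

```python
def daily_dose_range_mg(weight_kg: float, monoclonal: bool = False) -> tuple[float, float]:
    """Return (low, high) daily dose in mg for a given body weight.

    Polyclonal range from the text: about 0.1 to about 10 mg/kg per day.
    Monoclonal range (assumed reading): 5% of the low bound to 10% of
    the high bound of the polyclonal range.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low = 0.1 * weight_kg
    high = 10.0 * weight_kg
    if monoclonal:
        return (0.05 * low, 0.10 * high)
    return (low, high)


if __name__ == "__main__":
    print(daily_dose_range_mg(70.0))
    print(daily_dose_range_mg(70.0, monoclonal=True))
```

For a 70 kg patient this gives roughly 7 to 700 mg per day for polyclonal antibodies and, under the assumed reading, roughly 0.35 to 70 mg per day for monoclonal antibodies.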

An effective amount of the compound is that which
provides either subjective relief of a symptom(s) or
an objectively identifiable improvement as noted by
the clinician or other qualified observer. The dosing
range varies with the compound used, the route of
administration and the potency of the particular
compound.

This invention provides substances suitable for use as
vaccines for the prevention of KS and methods for
administering them. The vaccines are directed against
KSHV and most preferably comprise antigens obtained
from KSHV. In one embodiment, the vaccine contains
attenuated KSHV. In another embodiment, the vaccine
contains killed KSHV. In another embodiment, the
vaccine contains a nucleic acid vector encoding a KSHV
polypeptide. In another embodiment, the vaccine is a
subunit vaccine containing a KSHV polypeptide.

This invention provides a recombinant KSHV virus with
a gene encoding a KSHV polypeptide deleted from the
genome. The recombinant virus is useful as an
attenuated vaccine to prevent KSHV infection.

This invention provides a method of vaccinating a
subject against Kaposi's sarcoma, comprising
administering to the subject an effective amount of
the peptide or polypeptide encoded by the isolated DNA
molecule, and a suitable acceptable carrier, thereby
vaccinating the subject. In one embodiment naked DNA
is administered to the subject in an effective amount
to vaccinate the subject against Kaposi's sarcoma.

This invention provides a method of immunizing a
subject against disease caused by KSHV which comprises
administering to the subject an effective immunizing
dose of an isolated herpesvirus subunit vaccine.



D. Vaccines

The vaccine can be made using synthetic peptide or
recombinantly-produced polypeptide described above as
antigen.



Typically, a vaccine will include from about
1 to 50 micrograms of antigen. More preferably, the
amount of polypeptide is from about 15 to about 45
micrograms. Typically, the vaccine is formulated so
that a dose includes about 0.5 milliliters. The
vaccine may be administered by any route known in the
art. Preferably, the route is parenteral. More
preferably, it is subcutaneous or intramuscular.

There are a number of strategies for amplifying an
antigen's effectiveness, particularly as related to
the art of vaccines. For example, cyclization or
circularization of a peptide can increase the
peptide's antigenic and immunogenic potency. See U.S.
Pat. No. 5,001,049. More conventionally, an antigen
can be conjugated to a suitable carrier, usually a
protein molecule. This procedure has several facets.
It can allow multiple copies of an antigen, such as a
peptide, to be conjugated to a single larger carrier
molecule. Additionally, the carrier may possess
properties which facilitate transport, binding,
absorption or transfer of the antigen.



For parenteral administration, such as subcutaneous
injection, examples of suitable carriers are the
tetanus toxoid, the diphtheria toxoid, serum albumin
and lamprey, or keyhole limpet, hemocyanin because
they provide the resultant conjugate with minimum
genetic restriction. Conjugates including these
universal carriers can function as T cell clone
activators in individuals having very different gene
sets.

The conjugation between a peptide and a carrier can be
accomplished using one of the methods known in the
art. Specifically, the conjugation can use
bifunctional cross-linkers as binding agents as
detailed, for example, by Means and Feeney, "A recent

polypeptides from the human herpesvirus. For example,
it is known in the protein art that certain amino acid
residues can be substituted with amino acids of
similar size and polarity without an undue effect upon
the biological activity of the protein. The human
herpesvirus polypeptides have significant tertiary
structure and the epitopes are usually conformational.
Thus, modifications should generally preserve
conformation to produce a protective immune response.

E. Antibody Prophylaxis

Therapeutic, intravenous, polyclonal or monoclonal
antibodies can be used as a mode of passive
immunotherapy of herpesviral diseases including
perinatal varicella and CMV. Immune globulin from
persons previously infected with the human herpesvirus
and bearing a suitably high titer of antibodies
against the virus can be given in combination with
antiviral agents (e.g. ganciclovir), or in combination
with other modes of immunotherapy that are currently
being evaluated for the treatment of KS, which are
targeted to modulating the immune response (i.e.
treatment with copolymer-1, antiidiotypic monoclonal
antibodies, T cell "vaccination"). Antibodies to
human herpesvirus can be administered to the patient
as described herein. Antibodies specific for an
epitope expressed on cells infected with the human
herpesvirus are preferred and can be obtained as
described above.

A polypeptide, analog or active fragment can be
formulated into the therapeutic composition as
neutralized pharmaceutically acceptable salt forms.
Pharmaceutically acceptable salts include the acid
addition salts (formed with the free amine groups of
the polypeptide or antibody molecule) and which are

The vaccines may be administered by any conventional
method for the administration of vaccines, including
oral and parenteral (e.g., subcutaneous or
intramuscular) injection. Intramuscular administration
is preferred. The treatment may consist of a single
dose of vaccine or a plurality of doses over a period
of time. It is preferred that the dose be given to a
human patient within the first 6 months of life. The
antigen of the invention can be combined with
appropriate doses of compounds including [illegible]
such as influenza type A antigens. Also, the antigen
could be a component of a recombinant vaccine which
could be adaptable for oral administration.

Vaccines of the invention may be combined with other
vaccines for other diseases to produce multivalent
vaccines. A pharmaceutically effective amount of the
antigen can be employed with a pharmaceutically
acceptable carrier such as a protein or diluent useful
for the vaccination of mammals.

Other vaccines may be prepared according to methods
well-known to those skilled in the art.

Those of skill will readily recognize that it is only
necessary to expose a mammal to appropriate epitopes
in order to elicit effective immunoprotection. The
epitopes are typically segments of amino acids which
are a small portion of the whole protein. Using
recombinant genetics, it is routine to alter a natural
protein's primary structure to create derivatives
embracing epitopes that are identical to or
substantially the same as (immunologically equivalent
to) the naturally occurring epitopes. Such
derivatives may include peptide fragments, amino acid
substitutions, amino acid deletions and amino acid
additions of the amino acid sequence for the viral

formed with inorganic acids such as, for example,
hydrochloric or phosphoric acids, or such organic
acids as acetic, tartaric, mandelic, and the like.
Salts formed from the free carboxyl groups can also be
derived from inorganic bases such as, for example,
sodium, potassium, ammonium, calcium, or ferric
hydroxides, and such organic bases as isopropylamine,
trimethylamine, 2-ethylamino ethanol, histidine,
procaine, and the like.

F. Monitoring Therapeutic Efficacy

This invention provides a method for monitoring the
therapeutic efficacy of treatment for Kaposi's sarcoma
which comprises: (a) determining in a first sample
from a subject with Kaposi's sarcoma the presence of
the isolated nucleic acid molecule; (b) administering
to the subject a therapeutic amount of an agent such
that the agent is contacted to the cell in a sample;
(c) determining after a suitable period of time the
amount of the isolated nucleic acid molecule in the
second sample from the treated subject; and (d)
comparing the amount of isolated nucleic acid molecule
determined in the first sample with the amount
determined in the second sample, a difference
indicating the effectiveness of the agent, thereby
monitoring the therapeutic efficacy of treatment for
Kaposi's sarcoma. As defined herein "amount" is viral
load or copy number. Methods of determining viral
load or copy number are known to those skilled in the
art.
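
The comparison in the final step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the two-fold threshold are hypothetical and are not taken from the specification, which leaves the method of measuring viral load or copy number to the skilled reader.

```python
def fold_change(first_copies: float, second_copies: float) -> float:
    """Ratio of second-sample to first-sample copy number (below 1 is a drop)."""
    if first_copies <= 0:
        raise ValueError("first-sample copy number must be positive")
    return second_copies / first_copies


def treatment_effect(first_copies: float, second_copies: float,
                     min_fold: float = 2.0) -> str:
    """Classify the change in viral copy number between the two samples.

    min_fold is an illustrative threshold (assumption), not a value
    stated in the text.
    """
    ratio = fold_change(first_copies, second_copies)
    if ratio <= 1.0 / min_fold:
        return "decreased"
    if ratio >= min_fold:
        return "increased"
    return "unchanged"
```

A call such as `treatment_effect(1000, 100)` compares a hypothetical pre-treatment copy number with a post-treatment one; any threshold used in practice would depend on the precision of the assay.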



G. Screening Assays For Pharmaceuticals for
Alleviating the Symptoms of KS

Since an agent involved in the causation or
progression of KS has been identified and described,
assays directed to identifying potential
pharmaceutical agents that inhibit the biological
activity of the agent are possible. KS drug screening
assays which determine whether or not a drug has
activity against the virus described herein are
contemplated in this invention. Such assays comprise
incubating a compound to be evaluated for use in KS
treatment with cells which express the KS associated
human herpesvirus polypeptides or peptides and
determining therefrom the effect of the compound on
the activity of such agent. In vitro assays in which
the virus is maintained in suitable cell culture are
preferred, though in vivo animal models would also be
effective.

Compounds with activity against the agent of interest
or peptides from such agent can be screened in in
vitro as well as in vivo assay systems. In vitro
assays include infecting peripheral blood leukocytes
or susceptible T cell lines such as MT-4 with the
agent of interest in the presence of varying
concentrations of compounds targeted against viral
replication, including nucleoside analogs, chain
terminators, antisense oligonucleotides and random
polypeptides (Asada et al., 1989, J. Clin. Microbiol.
27, 2204; Kikuta et al., 1989, Lancet Oct. 7, 861).
Infected cultures and their supernatants can be
assayed for the total amount of virus including the
presence of the viral genome by quantitative PCR, by
dot blot assays or by using immunologic methods. For
example, a culture of susceptible cells could be
infected with KSHV in the presence of various
concentrations of drug, fixed on slides after a period
of days, and examined for viral antigen by indirect
immunofluorescence with monoclonal antibodies to viral
polypeptides (Kikuta et al., supra). Alternatively,
chemically adhered MT-4 cell monolayers can be used

Vaccines against a number of the herpesviruses have
been successfully developed. Vaccines against
Varicella-Zoster Virus using a live attenuated Oka
strain are effective in preventing herpes zoster in
the elderly, and in preventing chickenpox in both
immunocompromised and normal children (Hardy, I., et
al., 1990, Inf. Dis. Clin. N. Amer. 4, 159; Hardy, I.
et al., 1991, New Engl. J. Med. 325, 1545; Levin, M.J.
et al., 1992, J. Inf. Dis. 166, 253; Gershon, A.A.,
1992, J. Inf. Dis. 166(Suppl), 563). Vaccines against
Herpes simplex Types 1 and 2 are also commercially
available with some success in protection against
primary disease, but have been less successful in
preventing the establishment of latent infection in
sensory ganglia (Roizman, B., 1991, Rev. Inf. Dis.
13(Suppl. 11), S852; Skinner, G.R. et al., 1992, Med.
Microbiol. Immunol. 180, 305).

Vaccines against KSHV can be made from the KSHV
envelope glycoproteins. These polypeptides can be
purified and used for vaccination (Lasky, L.A., 1990,
J. Med. Virol. 31, 59). MHC-binding peptides from
cells infected with the human herpesvirus can be
identified for vaccine candidates per the methodology
of Marloes, et al., 1991, Eur. J. Immunol. 21, 2963-
2970.

The KSHV antigen may be combined or mixed with various
solutions and other compounds as is known in the art.
For example, it may be administered in water, saline
or buffered vehicles with or without various adjuvants
or immunodiluting agents. Examples of such adjuvants
or agents include aluminum hydroxide, aluminum
phosphate, aluminum potassium sulfate (alum),
beryllium sulfate, silica, kaolin, carbon, water-in-
oil emulsions, oil-in-water emulsions, muramyl
dipeptide, bacterial endotoxin, lipid X,
Corynebacterium parvum (Propionibacterium acnes),
Bordetella pertussis, polyribonucleotides, sodium
alginate, lanolin, lysolecithin, vitamin A, saponin,
liposomes, levamisole, DEAE-dextran, blocked
copolymers or other synthetic adjuvants. Such
adjuvants are available commercially from various
sources, for example, Merck Adjuvant 65 (Merck and
Company, Inc., Rahway, N.J.) or Freund's Incomplete
Adjuvant and Complete Adjuvant (Difco Laboratories,
Detroit, Michigan). Other suitable adjuvants are
Amphigen (oil-in-water), Alhydrogel (aluminum
hydroxide), or a mixture of Amphigen and Alhydrogel.
Only aluminum is approved for human use.

The proportion of antigen and adjuvant can be varied
over a broad range so long as both are present in
effective amounts. For example, aluminum hydroxide
can be present in an amount of about 0.5% of the
vaccine mixture (Al2O3 basis). On a per-dose basis,
the amount of the antigen can range from about 0.1 µg
to about 100 µg protein per patient. A preferable
range is from about 1 µg to about 50 µg per dose. A
more preferred range is about 15 µg to about 45 µg.
A suitable dose size is about 0.5 ml. Accordingly, a
dose for intramuscular injection, for example, would
comprise 0.5 ml containing 45 µg of antigen in
admixture with 0.5% aluminum hydroxide. After
formulation, the vaccine may be incorporated into a
sterile container which is then sealed and stored at
a low temperature, for example 4°C, or it may be
freeze-dried. Lyophilization permits long-term
storage in a stabilized form.
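
The per-dose arithmetic in the preceding paragraph can be checked with a short Python sketch. It is illustrative only: the function names are hypothetical, and the reading of "0.5%" as weight per volume (grams per 100 ml) is an assumption, since the text does not state the basis of the percentage beyond "Al2O3 basis".

```python
def antigen_concentration_ug_per_ml(antigen_ug: float, dose_ml: float) -> float:
    """Antigen concentration in the formulated dose."""
    return antigen_ug / dose_ml


def adjuvant_mass_mg(dose_ml: float, percent_w_v: float) -> float:
    """Adjuvant mass per dose, ASSUMING percent means g per 100 ml (w/v)."""
    grams = dose_ml * percent_w_v / 100.0
    return grams * 1000.0


def in_preferred_range(antigen_ug: float, low: float = 15.0,
                       high: float = 45.0) -> bool:
    """True if the dose falls in the 'more preferred' 15 to 45 µg range."""
    return low <= antigen_ug <= high
```

For the example dose in the text (45 µg in 0.5 ml), `antigen_concentration_ug_per_ml(45, 0.5)` gives the working concentration, and `adjuvant_mass_mg(0.5, 0.5)` gives the aluminum hydroxide mass under the w/v assumption.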
for an infectious agent assay using indirect
immunofluorescent antibody staining to search for
focus reduction (Higashi et al., 1989, J. Clin.
Microbiol. 27, 2204).
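
A focus reduction readout of the kind referred to above is usually summarized as a percent reduction relative to an untreated control, and a 50% inhibitory concentration can then be estimated across a drug dilution series. The sketch below is a generic illustration, not the procedure of the cited reference; the function names and the log-linear interpolation are assumptions.

```python
import math
from typing import Optional, Sequence


def percent_focus_reduction(treated_foci: float, control_foci: float) -> float:
    """Percent drop in infected-cell foci relative to the untreated control."""
    if control_foci <= 0:
        raise ValueError("control focus count must be positive")
    return 100.0 * (1.0 - treated_foci / control_foci)


def ic50_by_interpolation(concentrations: Sequence[float],
                          reductions: Sequence[float]) -> Optional[float]:
    """Estimate the concentration giving 50% reduction.

    concentrations must be positive and ascending; interpolation is
    linear in log10(concentration). Returns None when 50% is not
    bracketed by two adjacent data points.
    """
    points = list(zip(concentrations, reductions))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= 50.0 <= r_hi and r_hi > r_lo:
            frac = (50.0 - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    return None
```

The same two functions apply equally to counts of antigen-positive cells from the immunofluorescence assay described earlier.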

As an alternative to whole cell in vitro assays,
purified KSHV enzymes isolated from a host cell or
produced by recombinant techniques can be used as
targets for rational drug design to determine the
effect of the potential drug on enzyme activity. KSHV
enzymes amenable to this approach include, but are not
limited to, dihydrofolate reductase (DHFR),
thymidylate synthase (TS), thymidine kinase or DNA
polymerase. A measure of enzyme activity indicates
effect on the agent itself.

Drug screens using herpes viral products are known and
have been previously described in EP 0 514 830 (herpes
proteases) and WO 94/04920 (UL13 gene product).

This invention provides an assay for screening anti-KS
chemotherapeutics. Infected cells can be incubated in
the presence of a chemical agent that is a potential
chemotherapeutic against KS (e.g., acyclo-guanosine).
The level of virus in the cells is then determined
after several days by immunofluorescence assay for
antigens, Southern blotting for viral genome DNA or
Northern blotting for mRNA and compared to control
cells. This assay can quickly screen large numbers of
chemical compounds that may be useful against KS.



Further, this invention provides an assay system that
is employed to identify drugs or other molecules
capable of binding to the nucleic acid molecule or
proteins, either in the cytoplasm or in the nucleus,
thereby inhibiting or potentiating transcriptional
activity. Such assay would be useful in the
development of drugs that would be specific against
particular cellular activity, or that would potentiate
such activity, in time or in level of activity.

This invention provides a method of screening for a
KSHV-selective antiviral drug in vivo comprising: (a)
expression of KSHV DHFR or KSHV TS in a bacterial
auxotroph (nutritional mutant); (b) measuring
bacterial growth rate in the absence and presence of
the drug; and (c) comparing the rates so measured so
as to identify the drug that inhibits KSHV DHFR or
KSHV TS in vivo.

Methods well known to those skilled in the art allow
selection or production of a suitable bacterial
auxotroph and measurement of bacterial growth.
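
Steps (b) and (c) above amount to estimating an exponential growth rate with and without drug and comparing the two. One common way to do this, sketched here as an assumption rather than as the method of the specification, is a least-squares fit of ln(optical density) against time; the function names and the use of optical density are hypothetical.

```python
import math
from typing import Sequence


def growth_rate(times_h: Sequence[float], od_values: Sequence[float]) -> float:
    """Slope of ln(OD) versus time (per hour) by ordinary least squares."""
    n = len(times_h)
    if n < 2 or n != len(od_values):
        raise ValueError("need at least two paired time/OD readings")
    logs = [math.log(od) for od in od_values]
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx


def percent_inhibition(rate_without_drug: float, rate_with_drug: float) -> float:
    """Percent reduction in growth rate caused by the drug."""
    return 100.0 * (1.0 - rate_with_drug / rate_without_drug)
```

Only readings from the exponential phase should be passed to `growth_rate`; a culture that doubles every hour gives a rate of ln 2 per hour.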

The following reviews of antifolate compounds are
provided to more fully describe the state of the art,
particularly as it pertains to inhibitors of
dihydrofolate reductase and thymidylate synthase: (a)
Unger, 1996, Current concepts of treatment in medical
oncology: new anticancer drugs, Journal of Cancer
Research & Clinical Oncology 122, 169-193; (b)
Jackson, 1995, Toxicity prediction from metabolic
pathway modelling, Toxicology 102, 197-205; (c)
Schultz, 1995, Newer antifolates in cancer therapy,
Progress in Drug Research 44, 129-157; (d) van der
Wilt and Peters, 1994, New targets for pyrimidine
antimetabolites in the treatment of solid tumours 1:
Thymidylate synthase, Pharm World Sci 15, 167; (e)
Fleisher, 1993, Antifolate analogs: mechanism of
action, analytical methodology, and clinical efficacy,
Therapeutic Drug Monitoring 15, 521-526; (f) Eggott et
al., 1993, Antifolates in rheumatoid arthritis: a
hypothetical mechanism of action, Clinical &
Experimental Rheumatology 11 Suppl 8, S101-S105; (g)
Huennekens et al., 1992, Membrane transport of folate
compounds, Journal of Nutritional Science &
Vitaminology Spec No, 52-57; (h) Fleming et al.,
1992, Antifolates: the next generation, Seminars in
Oncology 19, 707-719; and (i) Bertino et al., 1992,
Enzymes of the thymidylate cycle as targets for
chemotherapeutic agents: mechanisms of resistance,
Mount Sinai Journal of Medicine 59, 391-395.

This invention provides a method of determining the
health of a subject with AIDS comprising: (a)
measuring the plasma concentration of vMIP-I, vMIP-II
or vMIP-III; and (b) comparing the measured value to
a standard curve relating AIDS clinical course to the
measured value so as to determine the health of the
subject.



This invention provides a method of inhibiting HIV
replication, comprising administering to the subject
or treating cells of a subject with an effective
amount of a polypeptide which is encoded by a nucleic
acid molecule, so as to inhibit replication of HIV.
In one embodiment, the polypeptide is one from the
list provided in Table 1.



This invention is illustrated in the Experimental
Details sections which follow. These sections are set
forth to aid in understanding the invention but are
not intended to, and should not be construed to, limit
in any way the invention as set forth in the claims
which follow thereafter.



EXPERIMENTAL DETAILS SECTION



WO 97/27208 PCT/US97/01442

NUCLEOTIDE SEQUENCE OF THE KAPOSI'S SARCOMA-ASSOCIATED
HERPESVIRUS

The genome of the Kaposi's sarcoma-associated
herpesvirus (KSHV or HHV8) was mapped with cosmid and
phage genomic libraries from the BC-1 cell line. Its
nucleotide sequence was determined except for a 3 kb
region at the right end of the genome that was
refractory to cloning. The BC-1 KSHV genome consists
of a 140.5 kb long unique coding region (LUR) flanked
by multiple G+C rich 801 bp terminal repeat sequences.
A genomic duplication that apparently arose in the
parental tumor is present in this cell culture-derived
strain. At least 81 open reading frames (ORFs),
including 66 with similarity to herpesvirus saimiri
ORFs, and 5 internal repeat regions are present in the
LUR. The virus encodes genes similar to
complement-binding proteins, three cytokines (two
macrophage inflammatory proteins and interleukin-6),
dihydrofolate reductase, bcl-2, interferon regulatory
factor, IL-8 receptor, NCAM-like adhesin, and a D-type
cyclin, as well as viral structural and metabolic
proteins. Terminal repeat analysis of virus DNA from
a KS lesion suggests a monoclonal expansion of KSHV in
the KS tumor. The complete genome sequence is set
forth in Genbank Accession Numbers U75698 (LUR),
U75699 (TR) and U75700 (ITR).

Kaposi's sarcoma is a vascular tumor of mixed cellular
composition (Tappero et al., 1992, J. Am. Acad.
Dermatol. 28, 371-395). The histology and relatively
benign course in persons without severe
immunosuppression has led to suggestions that KS tumor
cell proliferation is cytokine induced (Ensoli et al.,
1992, Immunol. Rev. 127, 147-155). Epidemiologic
studies indicate the tumor is under strict immunologic
control and is likely to be caused by a sexually
transmitted infectious agent other than HIV (Peterman
et al., 1992, AIDS 7, 605-611). KS-associated
herpesvirus (KSHV) was discovered in an AIDS-KS lesion
by representational difference analysis (RDA) and
shown to be present in almost all AIDS-KS lesions
(Chang et al., 1994, Science 265, 1865-1869). These
findings have been confirmed and extended to nearly
all KS lesions examined from the various epidemiologic
classes of KS (Boshoff et al., 1995, Lancet 345,
1043-1044; Dupin et al., 1995, Lancet 345, 761-762;
Moore and Chang, 1995, New Eng. J. Med. 332,
1181-1185; Schalling et al., 1995, Nature Med. 1,
707-708; Chang et al., 1996, Arch. Int. Med. 156,
202-204). KSHV is the eighth presumed human
herpesvirus (HHV8) identified to date.

The virus was initially identified from two herpesvirus
DNA fragments, KS330Bam and KS631Bam (Chang et al.,
1994, Science 265, 1865-1869). Subsequent sequencing
of a 21 kb AIDS-KS genomic library fragment (KS5)
hybridizing to KS330Bam demonstrated that KSHV is a
gammaherpesvirus related to herpesvirus saimiri (HVS)
belonging to the genus Rhadinovirus (Moore et al.,
1996, J. Virol. 70, 549-558). Colinear similarity
(synteny) of genes in this region is maintained
between KSHV and HVS, as well as Epstein-Barr virus
(EBV) and equine herpesvirus 2 (EHV2). A 12 kb region
(L54 and SGL-1) containing the KS631Bam sequence
includes cyclin D and IL-8Rα genes unique to
rhadinoviruses.

KSHV is not readily transmitted to uninfected cell
lines (Moore et al., 1996, J. Virol. 70, 549-558),
but it is present in a rare B cell primary effusion
(body cavity-based) lymphoma (PEL) frequently
associated with KS (Cesarman et al., 1995, New Eng. J.
Med. 332, 1186-1191). BC-1 is a PEL cell line
containing a high KSHV genome copy number and is
coinfected with EBV (Cesarman et al., 1995, Blood 86,
2708-2714). The KSHV genome form in BC-1 and its
parental tumor comigrates with 270 kb linear markers
on pulsed field gel electrophoresis (PFGE) (Moore et
al., 1996, J. Virol. 70, 549-558). However, the
genome size based on encapsidated DNA from an
EBV-negative cell line (Renne et al., 1996, Nature
Med. 2, 342-346) is estimated to be 165 kb (Moore et
al., 1996, J. Virol. 70, 549-558). Estimates from KS
lesions indicate a genome size larger than that of EBV
(172 kb) (Decker et al., 1996, J. Exp. Med. 184,
283-288).

To determine the genomic sequence of KSHV and identify
novel virus genes, contiguous overlapping virus DNA
inserts from BC-1 genomic libraries were mapped. With
the exception of a small, unclonable repeat region at
its right end, the genome was sequenced to high
redundancy allowing definition of the viral genome
structure and identification of genes that may play a
role in KSHV-related pathogenesis.



MATERIALS AND METHODS 

Library generation and screening. BC-1, HBL-6 and
BCP-1 cells were maintained in RPMI 1640 with 20%
fetal calf serum (Moore et al., 1996, J. Virol. 70,
549-558; Cesarman et al., 1995, Blood 86, 2708-2714;
Gao et al., 1996, Nature Med. 2, 925-928). DNA from
BC-1 cells was commercially cloned (Sambrook et al.,
1989, Molecular Cloning: A laboratory manual, Cold
Spring Harbor Press, Salem, Mass.) into either Lambda
FIX II or S-Cos1 vectors (Stratagene, La Jolla, CA).
Phage and cosmid libraries were screened by standard
methods (Benton et al., 1977, Science 196, 180-182;
Hanahan and Meselson, 1983, Methods Enzymol. 100,
333-342).

Initial library screening was performed using the
KS330Bam and KS631Bam RDA fragments (Chang et al.,
1994, Science 265, 1865-1869). Overlapping clones
were sequentially identified using probes synthesized
from the ends of previously identified clones (Figure
1) (Feinberg and Vogelstein, 1983, Anal. Biochem. 132,
6; Melton et al., 1984, Nucl. Acids Res. 12,
7035-7056). The map was considered circularly
permuted by the presence of multiple, identical TR
units in cosmids Z2 and Z6. Each candidate phage or
cosmid was confirmed by tertiary screening.

Shotgun sequencing and sequence verification

Lambda and cosmid DNA was purified by standard methods
(Sambrook et al., 1989, Molecular Cloning: A
laboratory manual, Cold Spring Harbor Press, Salem,
Mass.). Shotgun sequencing (Deininger, 1983, Anal.
Biochem. 129, 216-222; Bankier et al., 1987, Meth.
Enzymol. 155, 51-93) was performed on sonicated DNA.
A 1-4 kb fraction was subcloned into M13mp18 (New
England Biolabs, Inc., Beverly, MA) and propagated in
XL1-Blue cells (Stratagene, La Jolla, CA) (Sambrook et
al., 1989, Molecular Cloning: A laboratory manual,
Cold Spring Harbor Press, Salem, Mass.). M13 phages
were positively screened using insert DNA from the
phage or cosmid, and negatively screened with vector
arm DNA or adjacent genome inserts.

Automated dideoxy cycle sequencing was performed with
M13 (-21) CS+ or FS dye primer kits (Perkin-Elmer,
Branchburg NJ) on ABI 373A or 377 sequenators (ABI,
Foster City, CA). Approximately 300 M13 sequences
were typically required to achieve initial coverage
for each 10 kb of insert sequence. Minimum sequence
fidelity standards were defined as complete
bidirectional coverage with at least 4 overlapping
sequences at any given site. For regions with
sequence gaps, ambiguities or frameshifts that did not
meet these criteria, primer walking was done with
custom primers (Perkin-Elmer) and dye terminator
chemistry (FS or Ready Reaction kits, Perkin-Elmer).
An unsequenced 3 kb region adjacent to the right end
TR sequence in the Z2 cosmid insert could not be
cloned into M13 or Bluescript despite repeated
efforts.

Sequence assembly and open reading frame analysis

Sequence data were edited using Factura (ABI, Foster
City, CA) and assembled into contiguous sequences
using electropherograms with AutoAssembler (ABI,
Foster City, CA) and into larger assemblies with
AssemblyLIGN (IBI-Kodak, Rochester NY). Base
positions not clearly resolved by multiple sequencing
attempts (less than 10 bases in total) were assigned
the majority base pair designation. The entire
sequence (in 1-5 kb fragments) and all predicted open
reading frames (ORFs) were analyzed using BLASTX,
BLASTP and BLASTN (Altschul et al., 1990, J. Mol.
Biol. 215, 403-410). The sequence was further
analyzed using MOTIFS (Moore et al., 1996, J. Virol.
70, 549-558), REPEAT and BESTFIT (GCG), and MacVector
(IBI, New Haven, CT).



ORF assignment and nomenclature

All ORFs with similarities to HVS were identified.
These and other potential ORFs having more than 100
amino acids were found using MacVector. ORFs not
similar to HVS ORFs were included in the map (Fig. 1)
based on similarity to other known genes, optimum
initiation codon context (Kozak, 1987, Nucl. Acids
Res. 15, 8125-8148), size and position. Conservative
selections were made to minimize spurious assignments;
this underestimates the number of true reading frames.
KSHV ORF nomenclature is based on HVS similarities;
KSHV ORFs not similar to HVS genes are numbered in
consecutive order with a K prefix. ORFs with sequence
but not positional similarity to HVS ORFs were
assigned the HVS ORF number (e.g., ORF 2). As new
ORFs are identified, it is suggested that they be
designated by decimal notation. The standard map
orientation (Fig. 1) of the KSHV genome is the same as
for HVS (Albrecht et al., 1991, J. Virol. 66,
5047-5058) and EHV2 (Telford et al., 1995, J. Mol.
Biol. 245, 520-526), and reversed relative to the EBV
standard map (Baer et al., 1984, Nature 310, 207-211).
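
The size criterion described above (candidate ORFs of more than 100 amino acids) can be illustrated with a simple six-frame scan. This is a minimal sketch and not a reimplementation of MacVector: it assumes ORFs run from the first ATG in a frame to the next in-frame stop codon, and it ignores the similarity, initiation codon context and position criteria that the authors also applied. All function names are hypothetical.

```python
from typing import List, Tuple

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_aa: int = 100) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, 0-based half-open, for ATG..stop ORFs
    in the three forward frames that encode more than min_aa residues."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 > min_aa:  # residues, stop excluded
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq: str, min_aa: int = 100) -> List[Tuple[int, int, str]]:
    """Scan both strands; coordinates are reported on the forward strand."""
    seq = seq.upper()
    n = len(seq)
    forward = [(s, e, "+") for s, e in orfs_in_strand(seq, min_aa)]
    reverse = [(n - e, n - s, "-")
               for s, e in orfs_in_strand(reverse_complement(seq), min_aa)]
    return forward + reverse
```

Because the scan keeps only the first ATG in each open stretch, it reports the longest candidate per stop codon, which is in the same spirit as the conservative selection described in the text.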



RESULTS 

Genomic mapping and sequence characteristics

Complete genome mapping was achieved with 7 lambda and
2 cosmid clones (Fig. 1). The structure of the BC-1
KSHV genome is similar to HVS in having a long unique
region (LUR) flanked by TR units. The ~140.5 kb LUR
sequence has 53.5% G+C content and includes all
identified KSHV ORFs. TR regions consist of multiple
801 bp direct repeat units having 84.5% G+C content
(Fig. 2A) with potential packaging and cleavage sites.
Minor sequence variations are present among repeat
units. The first TR unit at the left (Z6) TR junction
(205 bp) is deleted and truncated in BC-1 compared to
the prototypical TR unit.
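
G+C content figures such as those quoted above are simple base-composition percentages. A minimal sketch follows; the function name is hypothetical, and ambiguous bases (for example N) are excluded from the denominator as a stated assumption.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted
```

Applied separately to the LUR and to a single terminal repeat unit, a function like this would yield the two percentages being contrasted in the paragraph above.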



The genome sequence abutting the right terminal repeat
region is incomplete due to a 3 kb region in the Z2
cosmid insert that could not be cloned into sequencing
vectors. Partial sequence information from primer
walking indicates that this region contains stretches
of 16 bp A+G rich imperfect direct repeats
interspersed with at least one stretch of 16 bp C+T
rich imperfect direct repeats. These may form a
larger inverted repeat that could have contributed to
our difficulty in subcloning this region. Greater
than 12-fold average sequence redundancy was achieved
for the entire LUR with complete bidirectional
coverage by at least 4 overlapping reads except in the
unclonable region.
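
The fidelity standard restated above (at least 4 overlapping reads at every site, with both strands represented) can be expressed as a per-position check. The sketch below is illustrative only; the read representation and function names are assumptions and do not describe the assembly software the authors used.

```python
from typing import List, Tuple

# (start, end, strand): 1-based inclusive coordinates, strand "+" or "-"
Read = Tuple[int, int, str]


def failing_positions(reads: List[Read], length: int,
                      min_depth: int = 4) -> List[int]:
    """Positions with fewer than min_depth reads or only one strand covered."""
    plus = [0] * (length + 1)
    minus = [0] * (length + 1)
    for start, end, strand in reads:
        counts = plus if strand == "+" else minus
        for pos in range(max(start, 1), min(end, length) + 1):
            counts[pos] += 1
    return [p for p in range(1, length + 1)
            if plus[p] + minus[p] < min_depth or plus[p] == 0 or minus[p] == 0]


def mean_redundancy(reads: List[Read], length: int) -> float:
    """Average number of reads covering each position."""
    total = sum(max(0, min(end, length) - max(start, 1) + 1)
                for start, end, _ in reads)
    return total / length
```

Positions returned by `failing_positions` correspond to the sites that, in the text, were revisited by primer walking.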

The BC-1 TR region was examined by Southern blotting
since sequencing of the entire region is not possible
due to its repeat structure. BC-1, BCP-1 (an
EBV-negative, KSHV infected cell line) and KS lesion
DNAs have an intense ~800 bp signal consistent with
the unit length repeat sequence when digested with
enzymes that cut once in the TR and hybridized to a TR
probe (Figs. 2B and 2C). Digestion with enzymes that
do not cut in the TR indicates that the BC-1 strain
contains a unique region buried in the TR, flanked by
~7 kb and ~25 kb TR sequences (Figs. 2C and 2D). An
identical pattern occurs in HBL-6, a cell line
independently derived from the same tumor as BC-1,
suggesting that this duplication was present in the
parental tumor (Figs. 2C and 2D). The restriction
pattern with Not I, which also cuts only once within
the TR but rarely within the LUR, suggests that the
buried region is at least 33 kb. Partial sequencing
of this region demonstrates that it is a precise
genomic duplication of the region beginning at ORF K6.
The LUR is 140 kb including the right end unsequenced
gap (under 3 kb). The estimated KSHV genomic size in
BC-1 and HBL-6 (including the duplicated region) is
approximately 210 kb.

Based on the EBV replication model used in clonality
studies (Raab-Traub and Flynn, 1986, Cell 47,
883-889), the polymorphic BCP-1 laddering pattern may
reflect lytic virus replication and superinfection
(Fig. 2C). The EBV laddering pattern occurs when TR
units are deleted or duplicated during lytic
replication and is a stochastic process for each
infected cell (Raab-Traub and Flynn, 1986, Cell 47,
883-889). No laddering is present for BC-1 which is
under tight latent KSHV replication control (Moore et
al., 1996, J. Virol. 70, 549-558). KS lesion DNA
also shows a single hybridizing band suggesting that
virus in KS tumor cells may be of monoclonal origin.



- n 



w -cd-nn -esicr.s -h° K- c HV 

The KSHV genome shares the 7 block (B) organization
of other herpesviruses (Chee et al., 1990, Curr.
Topics Microbiol. Immunol. 154, 125-169) with
sub-family specific or unique ORFs present between
blocks (interblock regions (IB) a-h, Fig. 1). ORF
analysis indicates that only 79% of the sequenced
137.5 kb LUR encodes 81 identifiable ORFs which is
likely to be due to a conservative assignment of ORF
positions. The overall LUR CpG dinucleotide
observed/expected (O/E) ratio is 0.75 consistent with
a moderate loss of methylated cytosines, but there is
marked regional variation. The lowest CpG O/E ratios
(<0.[illegible]7) occur in IBa (bp 1-3200), in
[illegible] (68,602-69,405), and IBh
(117,352-137,507). The highest O/E ratios (>0.88)
extend from B2 to [illegible], in IBe
(67,301-68,60[illegible]), and in B6 (77,251-83,600).
Comparison to the KS5 sequence (Moore et al., 1996,
J. Virol. 70, 549-558) shows a high sequence
conservation between these two strains with only 21
point mutations over the comparable 20.7 kb region
(0.1%). A frameshift within BC-1 ORF 26 (position
45,004) compared to KS5 ORF 26 was not resolvable
despite repeated sequencing of KS5 and products
amplified from BC-1. Two additional frameshifts in
noncoding regions (bp 4[illegible] and 45,335) are
also present compared to the KS5 sequence.
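The CpG observed/expected statistic discussed above can be illustrated with a minimal Python sketch. It assumes the commonly used normalization O/E = (CpG count x sequence length) / (C count x G count); the text does not state which formula or window size was applied to the LUR, so the function names, the window and step parameters, and the toy sequences below are hypothetical and serve only as an illustration.

```python
def cpg_observed_expected(seq: str) -> float:
    """Return the CpG observed/expected ratio for one DNA sequence.

    Assumed formula (not taken from the source text):
        O/E = (number of CpG * length) / (number of C * number of G)
    """
    s = seq.upper()
    c_count = s.count("C")
    g_count = s.count("G")
    if c_count == 0 or g_count == 0:
        # No CpG can be expected without both bases; report 0.0.
        return 0.0
    cpg_count = s.count("CG")  # "CG" cannot overlap itself
    return cpg_count * len(s) / (c_count * g_count)


def windowed_ratios(seq: str, window: int, step: int):
    """Return (start, O/E) pairs for windows sliding along seq.

    Window and step sizes are hypothetical parameters chosen by the
    caller; regional variation shows up as differing O/E values.
    """
    return [
        (start, cpg_observed_expected(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "CGCGAAAA"  # toy sequence, not KSHV data
    for start, ratio in windowed_ratios(toy, window=4, step=4):
        print(start, ratio)
```

A ratio near 1.0 means CpG occurs about as often as base composition predicts, while values below 1.0 indicate CpG depletion of the kind the text attributes to loss of methylated cytosines.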



Several repeat regions are present in the LUR (Fig.
1). A 143 bp sequence is repeated within ORF K11 at
positions 92,678-92,820 and 92,852-92,994 (waka/jwka).
Complex repeats are present in other regions of the
genome: 20 and 30 bp repeats in the region from
24,285-24,902 (frnk), a 13 bp repeat between bases
29,775 and 29,942 (vnct), two separate 23 bp repeat
stretches between bases [illegible],123 and 113,697
(eppa), and 15 different 11-15 bp repeats throughout
the region from 124,527 to 126,276 (moi). A complex
A-G rich repeat region (mdsk) begins at 137,055 and
extends into the unsequenced gap.

Conserved ORFs with similar genes found in other
herpesviruses are listed in Table 1, along with their
polarity, map positions, sizes, relatedness to HVS and
EBV ORFs, and putative functions. Conserved ORFs
coding for viral structural proteins and enzymes
include genes involved in viral DNA replication (e.g.,
DNA polymerase (ORF 9)), nucleotide synthesis (e.g.,
dihydrofolate reductase (DHFR, ORF 2), thymidylate
synthase (TS, ORF 70)), regulators of gene expression
(R transactivator (LCTP, ORF 50)) and 5 conserved
herpesvirus structural capsid and 5 glycoprotein
genes.

Several genes that are similar to HVS ORFs also have
unique features. ORF 45 has sequence similarity to
nuclear and transcription factors (chick nucleolin and
yeast SIR3) and has an extended acidic domain typical
for transactivator proteins between amino acids 90 and
[illegible]. ORF 7[illegible] also has an extended
acidic domain separated into two regions by a
glutamine-rich sequence encoded by the moi repeat.
The first region consists almost exclusively of
aspartic and glutamic acid residue repeats while the
second glutamic acid rich region has a repeated
leucine heptad suggestive of a leucine zipper
structure. [illegible], a putative tegument protein,
has a high level of similarity to the purine
biosynthetic enzyme of E. coli and D. melanogaster,
N-formylglycinamide ribotide amidotransferase
(FGARAT).



;u: arc 



ORFs K3 and :<5 are not similar tc HVS genes 

^-milar tc the major immediate eariy 

herpesvirus type 4 ( BKV4 ) gene ZEZ Jl= ^ and 13% 

identity respectively) (van Santen, 1921, vnc 

5-1-1-5224;. These genes have no signincant 

• rv, ffl "^erpes simplex virus - ' " ~ ■ 
similar_~y . ---- r 

; 3HV4 1=11 : , but enccce proteins 



(which is similar 
2C sharing with the HSV1 ICPC protein a 

;h- -h mav form a zinc finger mctii r.-s.. — = - 



-cron which may form a zinc finger met it '-an — 

=994' t rc crotem er.coceo 
1951, o. Virol - 3 2----^4,. ~ 

bv OR? K= has a region similar — — - n _~ _ 
localization site present ir. tne .«te r ^ !r '/- 
2* orotein. OF.r KS has a purine binding moti: ,„_-v.«».»; 

t he c-cerrainus of -he prozein which is si,uar wO 
a motif present m the KSHV TK :ORFli: (Moore e:a:., 
1996, J. Virci . -70, 545-556). 



No KSHV genes with similarity to HVS ORFs [illegible]
12, 14, 15, 51 and 71 were identified in the KSHV LUR
sequence. HVS ORF 1 codes for a transforming protein,
responsible for HVS-induced in vitro lymphocyte
transformation (Akari et al., [illegible] 362-388)
and has poor sequence conservation among strains (Jung
and Desrosiers, 1991, J. Virol. [illegible],
6953-69[illegible]; Jung and Desrosiers, 1995, Molec.
Cellular Biol. 15, 5506-5512). Functional KSHV genes
similar to this gene may be present but were not
identifiable by sequence comparison. Likewise, no
KSHV genes similar to EBV latency and
transformation-associated proteins (EBNA-1, EBNA-2,
EBNA-LP, LMP-1, LMP-2 or gp350/220) were found despite
some similarity to repeat sequences present in these
genes. KSHV also does not have a gene similar to the
BZLF1 EBV transactivator gene.

Several sequences were not given ORF assignments
although they have characteristics of expressed genes.
The sequence between bp 90,173 and 90,643 is similar
to the precursor of secreted glycoprotein X (gX),
encoded by a number of alphaherpesviruses
(pseudorabies, EHV1), and which does not form part of
the virion structure. Like the cognate gene in EHV1,
the KSHV form lacks the highly-acidic carboxy terminus
of the pseudorabies gene.

Two polyadenylated transcripts expressed at high copy
number in BCBL-1 are present at positions
28,661-29,741 (T1.1) in IBb and 118,130-117,436 (T0.7)
in IBh. T0.7 encodes a 60 residue polypeptide (ORF
K12, also called Kaposin) and T1.1 (also referred to
as nut-1) has been speculated to be a U RNA-like
transcript.



Cell cycle regulation and cell signaling proteins



A number of ORFs which are either unique to KSHV or
shared only with other gammaherpesviruses encode genes
similar to oncoproteins and cell signaling proteins.
ORF 16, similar to EBV BHRF1 and HVS ORF16, encodes a
functional Bcl-2-like protein which can inhibit
Bax-mediated apoptosis. ORF 72 encodes a functional
cyclin D gene, also found in HVS (Nicholas et al.,
1992, Nature 355, 362-365), that can substitute for
human cyclin D in phosphorylating the retinoblastoma
tumor suppressor protein.

KSHV encodes a functionally-active IL-6 (ORF K2) and
two macrophage inflammatory proteins (MIPs) (ORFs K4
and K6) which are not found in other human
herpesviruses. The vIL-6 has 62% amino acid
similarity to the human IL-6 and can substitute for
human IL-6 in preventing mouse myeloma cell apoptosis.
Both MIP-like proteins have conserved C-C dimer
signatures characteristic of β-chemokines and near
sequence identity to human MIP-1α in their N-terminus
[illegible]. vMIP-I (ORF K6) can inhibit CCR-5
dependent HIV-1 replication. An open reading frame
spanning nucleotide numbers [illegible]
(vMIP-[illegible]) has low conservation with
[illegible] (BLASTX poisson p=0.0015) but retains the
C-C dimer motif. ORF K9 (vIRF1) encodes a 449 residue
protein with similarity to the family of interferon
regulatory factors (IRF) ([illegible], 1995, Pharmac.
Ther. [illegible]). It has 12.4% amino acid identity
to human interferon consensus sequence binding protein
and partial [illegible] of the IRF DNA-binding domain.
Three additional open reading frames at bp
[illegible] (vIRF2), [illegible] (vIRF3) and bp
94,127-93,63[illegible] (vIRF4) also have low
similarity to IRF-like proteins (p > 0.25). No
conserved interferon consensus sequences were found in
this region of the genome.



Other genes encoding signal transduction polypeptides,
which are also found in other herpesviruses, include
a complement-binding protein (v-CBP, ORF 4), a neural
cell adhesion molecule (NCAM)-like protein (v-adh, ORF
K14) and an IL-8 receptor (ORF 74). Genes similar to
ORFs 4 and 74 are present in other rhadinoviruses and
ORF 4 is similar to variola B19L and D12L proteins.
ORF K14 (v-adh) is similar to the rat and human OX-2
membrane antigens, various NCAMs and the poliovirus
receptor-related protein PRR1. OX-2 is in turn
similar to ORF U85 of human herpesviruses 6 and 7 but
there is no significant similarity between the KSHV
and betaherpesvirus OX-2/NCAM ORFs. Like other
immunoglobulin family adhesion proteins, v-adh has
V-like, C-like, transmembrane and cytoplasmic domains,
and an RGD binding site for fibronectin at residues
265-270. The vIL-8R has a seven transmembrane
spanning domain structure characteristic of G-protein
coupled chemoattractant receptors which includes the
EBV-induced EBI1 protein (Birkenbach et al., 1993, J.
Virol. 67, 2209-2220).

DISCUSSION 

The full-length sequence of the KSHV genome in BC-1
cells provides the opportunity to investigate
molecular mechanisms of KSHV-associated pathogenesis.
The KSHV genome has standard features of rhadinovirus
genomes including a single unique coding region
flanked by high G+C terminal repeat regions which are
the presumed sites for genome circularization. In
addition to having 66 conserved herpesvirus genes
involved in herpesvirus replication and structure,
KSHV is unique in encoding a number of proteins
mimicking cell cycle regulatory and signaling
proteins.



Our estimated size of the BC-1 derived genome (210 kb
including the duplicated portion) is consistent with
that found using encapsidated virion DNA (Zhong et
al., 1996, Proc. Natl. Acad. Sci. USA 93, 6641-6646).
Genomic rearrangements are common in cultured
herpesviruses (Baer et al., 1984, Nature 310, 207-211;
Cha et al., 1996, J. Virol. 70, 78-83). However, the
genomic duplication present in [illegible] did not
arise during tissue culture passage. [illegible]
hybridization studies indicate that this insertion of
a duplicated LUR fragment into the BC-1 [illegible] is
present in KSHV from the independently derived
[illegible] (Gaidano et al., 1996, [illegible]
1237-40).

Despite this genomic rearrangement, the KSHV genome is
well conserved within coding regions. There is less
than 0.1% base pair variation between the BC-1 and
21 kb KS5 fragment isolated from a KS lesion. Higher
levels of variation may be present in strains from
other geographic regions or other disease conditions.
Within the LUR, synteny to HVS is lost at ORFs 2 and
[illegible] but there is concordance in all
[illegible] conserved with HVS. [illegible] conserved
genes, such as thymidine kinase (TK) (Cesarman et al.,
1995, Blood 86, 2708-2714), TS and DHFR (which is
present in HVS, see Albrecht et al., 1992, J. Virol.
66, 5047-5058, but not human herpesviruses), encode
proteins that are appropriate targets for existing
drugs.

Molecular mimicry by KSHV of cell cycle regulatory and
signaling proteins is a prominent feature of the
virus. The KSHV genome has genes similar to cellular
complement-binding proteins (ORF 4), cytokines (ORFs
K2, K4 and K6), a bcl-2 protein (ORF 16), a cytokine
transduction pathway protein (K9), an IL-8R-like
protein (ORF 74) and a D-type cyclin (ORF 72).
Additional regions coding for proteins with some
similarity to MIP and IRF-like proteins are also
present in the KSHV genome. There is a striking
parallel between the KSHV genes that are similar to
cellular genes and the cellular genes known to be
induced by EBV infection. Cellular cyclin D,
CD21/CR2, bcl-2, an IL-8R-like protein [illegible]
and adhesion molecules are upregulated by EBV
infection (Birkenbach et al., 1993, J. Virol. 67,
2209-2220; Palmero et al., 1993, Oncogene 8,
1049-1054; Finke et al., 1992, Blood [illegible]; 
Finke et al., 1994, Leukemia & Lymphoma 21,
411-41[illegible]; Jones et al., 1995, J. Exper. Med.
182, [illegible]). This suggests that KSHV modifies
the same signaling and regulation pathways that EBV
modifies after infection, but does so by introducing
exogenous genes from its own genome.

Cellular defense against virus infection commonly
involves cell cycle shutdown, apoptosis (for review,
see Shen and Shenk, 1995, Curr. Opin. Genet. Devel. 5,
105-111) and elaboration of cell-mediated immunity
(CMI). The KSHV-encoded v-bcl-2, v-cyclin and v-IL-6
are active in preventing either apoptosis or cell
cycle shutdown (Chang et al., 1996, Nature 382, 410).
At least one of the β-chemokine KSHV gene products,
v-MIP-I, prevents CCR5-mediated HIV infection of
transfected cells. β-chemokines are not known to be
required for successful EBV infection of cells
although EBV-infected B cells express higher levels of
MIP-1α than normal tonsillar lymphocytes (Harris et
al., 1993, [illegible] 151, 5975-5982). The autocrine
dependence of EBV-infected B cells on small and
uncharacterized protein factors in addition to IL-6
(Tosato et al., 1990, J. Virol. 64, 3033-3041) leads
to speculation that β-chemokines may also play a role
in the EBV life cycle.

KSHV has not formally been shown to be a transforming
virus and genes similar to the major transforming
genes of HVS and EBV are not present in the BC-1
strain KSHV. Nonetheless, dysregulation of cell
proliferation control caused by the identified
KSHV-encoded proto-oncogenes and cytokines may
[illegible] cells.

K5KV fraamen:s car. :rar.sfcm NIK 2Z2 ce_i=. 
replication, like tha: cf E3V , involves rec:-:r. 
of" TR u-::s ;Raab-Traub and Fiynn, ! = "el. 
S E 3 - B 8 9 ) , a nonoxorphi: TR hybr id:za:^cn pa 
cresent in a K5 lesion would indicate a clonal 
population in the rumor, 
be in- a 

single transferred, KS- infected cell rather cnan 
bem= a "passenger virus". Identification zz 
genel similar to known oncoproteins and 
orolif eraticn factors in the current study pr~ 
evidence that KSHV is likely to oe a transit: 
virus . 



Preliminary studies suggest tnat sur-... 



true neoplastic proliferation [illegible]

EXPERIMENTAL DETAILS SECTION II:

MOLECULAR MIMICRY OF HUMAN CYTOKINE AND CYTOKINE
RESPONSE PATHWAY GENES BY KSHV

Four virus genes encoding proteins similar to two
human macrophage inflammatory protein (MIP)
chemokines, an IL-6 and an interferon regulatory
factor (IRF or ICSBP) polypeptide are present in the
genome of Kaposi's sarcoma-associated herpesvirus
(KSHV). Expression of these genes is inducible in
infected cell lines by phorbol esters. vIL-6 is
functionally active in B9 cell proliferation assays.
It is primarily expressed in KSHV-infected
hematopoietic cells rather than KS lesions. vMIP-I
inhibits replication of CCR5-dependent HIV-1 strains
in vitro indicating that it is functional and could
contribute to interactions between these two viruses.
Mimicry of cell signaling proteins by KSHV may
abrogate host cell defenses and contribute to
KSHV-associated neoplasia.



Kaposi's sarcoma-associated herpesvirus (KSHV) is a
gammaherpesvirus related to Epstein-Barr virus (EBV)
and herpesvirus saimiri (HVS). It is present in
nearly all KS lesions including the various types of
HIV-related and HIV-unrelated KS (Chang et al., 1994,
Science 266, 1865-1869; Boshoff et al., 1995, Lancet
345, 1043-1044; Dupin et al., 1995, Lancet 345,
761-762; Schalling et al., 1995, Nature Med. 1,
707-708). Viral DNA preferentially localizes to KS
tumors (Boshoff et al., 1995, Nature Med. 1,
1274-1278) and serologic studies show that KSHV is
specifically associated with KS. Related
lymphoproliferative disorders frequently occurring in
patients with KS, such as primary effusion lymphomas
(PEL), a rare B cell lymphoma, and some forms of
Castleman's disease are also associated with KSHV
infection (Cesarman et al., 1995, New Eng. J. Med.
332, 1186-1191; Soulier et al., 1995, Blood 86,
1276-1280). Three KSHV-encoded cytokine-like
polypeptides and a polypeptide similar to interferon
regulatory factor genes have now been identified.
Paradoxically, while cytokine dysregulation has been
proposed to cause Kaposi's sarcoma (Ensoli et al.,
1994, Nature 371, 674-680; Miles, 1992, Cancer
Treatment & Research 63, 125-140), in vitro spindle
cell lines used for these studies over the past decade
are uniformly uninfected with KSHV (Ambroziak et al.,
1995, Science 268, 582-583; Lebbe et al., 1995,
[illegible]).



To identify unique genes in the KSHV genome, genomic
sequencing (see METHODS) was performed using
Supercos-1 and Lambda FIX II genomic libraries from
BC-1, a nonHodgkin's lymphoma cell line stably
infected with both KSHV and EBV (Cesarman et al.,
1995, Blood 86, 2708-2714). The KSHV DNA fragments
KS330Bam and KS631Bam (Chang et al., 1994, Science
266, 1865-1869) were used as hybridization starting
points for mapping and bi-directional sequencing.
Open reading frame (ORF) analysis (see METHODS) of the
Z6 cosmid sequence identified two separate coding
regions (ORFs K4 and K6) with sequence similarity to
β-chemokines and a third coding region (ORF K2)
similar to human interleukin-6 (huIL-6); a fourth
coding region (ORF K9) is present in the Z8 cosmid
insert sequence with sequence similarity to interferon
regulatory factor (IRF) polypeptides (Figures 3A-3C).
None of these KSHV genes are similar to other known
viral genes. Parenthetically, a protein with
conserved cysteine motifs similar to β-chemokine motif
signatures has recently been reported in the molluscum
contagiosum virus (MCV) genome. Neither vMIP-I nor
vMIP-II has significant similarity to the MCV protein.

The cellular counterparts to these four viral genes
encode polypeptides involved in cell responses to
infection. For example, the MIP/RANTES (macrophage
inflammatory protein/regulated on activation, normal
T cell expressed and secreted) family of 6-11 kDa
β-chemoattractant cytokines (chemokines) play an
important role in virus infection-mediated
inflammation (Cook et al., 1995, Science 269,
1583-1585). β-chemokines are the natural ligand for
CCR5 and can block entry of non-syncytium inducing
(NSI), primary lymphocyte and macrophage-tropic HIV-1
strains in vitro by binding to this HIV co-receptor
(Cocchi et al., 1995, Science 270, 1811-1815). IL-6,
initially described by its effect on B cell
differentiation (Hirano et al., 1985, Proc Natl Acad
Sci USA 82, 5490; Kishimoto et al., 1995, Blood 86,
1243-1254), has pleiotropic effects on a wide variety
of cells and may play a pathogenic role in multiple
myeloma, multicentric Castleman's disease (a
KSHV-related disorder), AIDS-KS and EBV-related
postransplant lymphoproliferative disease (Klein et
al., 1995, Blood 85, 863-872; Hilbert et al., 1995, J
Exp Med 182, 243-248; Brandt et al., 1990, Curr Topic
Microbiol Immunol 166, 37-41; Leger et al., 1991,
Blood 78, 2923-2930; Burger et al., 1994, Annal
Hematol 69, 25-31; Tosato et al., 1993, J Clin Invest
91, 2806-2814). IL-6 production is induced by either
EBV or CMV infection and is an autocrine factor for
EBV-infected lymphoblastoid cells that enhances their
tumorigenicity in nude mice (Tosato et al., 1990, J
Virol 64, 3033-3041; Scala et al., 1990, J Exp Med
172, 61-68; Almeida et al., 1994, Blood 83, 370-376).
Cell lines derived from KS lesions, although not
infected with KSHV, also produce and respond to IL-6
(Miles et al., 1990, Proc Natl Acad Sci USA 87,
4068-4072; Yang et al., 1994, J Immunol [illegible]).
While MIP and IL-6 are secreted cytokines, the IRF
family of polypeptides regulate interferon-inducible
genes in response to γ- or α-/β-interferon cytokines
by binding to specific interferon consensus sequences
(ICS) within interferon-inducible promoter regions.
A broad array of cellular responses to interferons is
modulated by the repressor or transactivator functions
of IRF polypeptides and several members (IRF-1 and
IRF-2) have opposing anti-oncogenic and oncogenic
activities (Sharf et al., 1995, J Biol Chem 270,
13063-13069; Harada et al., 1993, Science 259,
971-974; Weisz et al., 1994, Internat Immunol 6,
1125-1131; Weisz et al., 1992, J Biol Chem 267,
25589-2559[illegible]).



The 285 bp ORF K6 (ORF [illegible]) gene encodes a
10.5 kDa polypeptide (vMIP-I; [illegible]) having
37.5% amino acid identity (71% similarity) to
huMIP-1α and slightly lower similarity to other
β-chemokines (Figure 3A). ORF K4 also encodes a
predicted 10.5 kDa polypeptide (vMIP-II; [illegible])
with close similarity in sequence and amino acid
hydrophobicity profile to vMIP-I. The two
KSHV-encoded MIP β-chemokines are separated from each
other on the KSHV genome by 5.5 kb of intervening
sequence containing at least 4 ORFs (see METHODS).
Both polypeptides have conserved β-chemokine motifs
(Figure 3A, residues 17-55) which include a
characteristic C-C dicysteine dimer (Figure 3A,
residues 36-37), and have near sequence identity to
human MIP-1α at residues 56-64. However, the two
polypeptides show only 45.0% amino acid identity to
each other and are markedly divergent at the
nucleotide level indicating that this duplication is
not a cloning artifact. The two viral polypeptides
are more closely related to each other than to
[illegible] huRANTES suggesting that they arose by
gene duplication rather than independent acquisition
from the host genome (see Sequence alignment in
METHODS). The reason for this double gene dosage in
the viral genome is unknown.
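The identity percentages quoted above are computed from gapped pairwise alignments. The following is a minimal Python sketch of one way to do that calculation. It assumes identity is counted over alignment columns in which neither sequence has a gap; other conventions (for example, dividing by the full alignment length) give different values, and the text does not say which was used. The function name and the toy aligned strings are hypothetical and are not the vMIP sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption (not stated in the source text): columns where either
    row carries a gap character are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment rows; three of four ungapped columns match.
    print(percent_identity("ACD-F", "ACE-F"))
```

Applying the same function to each pair (viral versus viral, viral versus human) gives the kind of comparison the paragraph uses to argue for gene duplication rather than independent acquisition.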

The KSHV ORF K2 (Figure 3B) encodes a hypothetical 204
residue, 23.4 kDa IL-6-like polypeptide with a
hydrophobic 19 amino acid secretory signaling peptide
having 24.8% amino acid identity and 62.2% similarity
to the human polypeptide. vIL-6 also has a conserved
sequence characteristic for IL-6-like interleukins
(amino acids 101-125 of the gapped polypeptide) as
well as conserved four cysteines which are present in
IL-6 polypeptides (gapped alignment residue positions
72, 78, 101 and 111 in Figure 3B). IL-6 is a
glycosylated cytokine and potential N-linked
glycosylation sites in the vIL-6 sequence are present
at gapped positions 96 and 107 in Figure 3B. The 449
residue KSHV vIRF polypeptide encoded by ORF K9 has
lower overall amino acid identity (approximately 13%)
to its human cellular counterparts than either of the
vMIPs or the vIL-6, but has a conserved region derived
from the IRF family of polypeptides (Figure 3C, gapped
residues 86-121). This region includes the
tryptophan-rich IRF ICS DNA binding domain although
only two of four tryptophans thought to be involved in
DNA binding are positionally conserved. It is
preceded by an 87-residue hydrophilic N-terminus with
little apparent IRF similarity. A low degree of amino
acid similarity is present at the C-terminus
corresponding to the IRF family
transactivator/repressor region.
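Potential N-linked glycosylation sites of the kind mentioned above are conventionally predicted by scanning a protein sequence for the sequon Asn-X-Ser/Thr, where X is any residue except proline. The following is a minimal Python sketch of such a scan. The function name and the toy peptide are hypothetical, the peptide is not the vIL-6 sequence, and the text does not describe the software actually used.

```python
import re

# Sequon N-X-[S/T] with X != P. A lookahead is used so that
# overlapping candidate sites are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy_peptide = "MNASNPTNGT"  # toy sequence for illustration only
    print(find_sequons(toy_peptide))
```

Positions returned by a scan of an ungapped sequence would still need to be mapped onto gapped alignment coordinates before comparison with positions quoted from an alignment figure.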

The four KSHV cell signaling pathway genes show
similar patterns of expression in virus-infected
[illegible] (see METHODS). Whole RNA was extracted
from BCP-1 (a cell line infected with KSHV alone) and
BC-1 (EBV and KSHV coinfected, see Cesarman et al.,
1995, Blood 86, 2708-2714) with or without
pretreatment with 20 ng/ml
12-O-tetradecanoylphorbol-13-acetate (TPA;
[illegible], St. Louis MO) for 48 hours. While
constitutive expression of these genes was variable
between the two cell lines, expression of all four
gene transcripts increased in BCP-1 and BC-1 cells
after TPA induction (Figures 4A-4D). This pattern is
consistent with expression occurring primarily during
lytic phase virus replication. Examination of viral
terminal repeat sequences of BCP-1 and BC-1
demonstrates that low level of virus lytic replication
occurs in BCP-1 but not BC-1 without TPA induction
(see METHODS), and both cell lines can be induced to
express lytic phase genes by TPA treatment despite
repression of DNA replication in BC-1. Lower level
latent expression is also likely, particularly for
[illegible] (Figure 4[illegible]) and [illegible]
(Figure 4D), since [illegible] detectable without TPA
induction [illegible] cells which are under
[illegible] control.

To determine if in vitro [illegible] partial virus
sequences that [illegible] genes, DNA was extracted
from four KS spindle cell lines (KS-2, KS-10, KS-12
and KS-22) and [illegible] vMIP-II, vIL-6 and vIRF
sequences (see METHODS). None of the spindle cell DNA
samples were positive for any of the four genes.

vIL-6 was examined in more detail using bioassays and
antibody localization studies to determine whether it
is functionally conserved. Recombinant vIL-6 (rvIL-6)
is specifically recognized by antipeptide antibodies
which do not cross-react with huIL-6 (Figures 5A-5B)
(see METHODS). vIL-6 is produced constitutively in
BCP-1 cells and increases markedly after 48 hour TPA
induction, consistent with Northern hybridization
experiments. The BC-1 cell line coinfected with both
KSHV and EBV only shows polypeptide expression after
TPA induction (Figure 5A, lanes 3-4) and control
EBV-infected P3HR1 cells are negative for vIL-6
expression (Figure 5A, lanes 5-6). Multiple high
molecular weight bands present after TPA induction
(21-25 kDa) may represent precursor forms of the
polypeptide. Despite regions of sequence
dissimilarity between huIL-6 and vIL-6, the virus
interleukin 6 has biologic activity in functional
bioassays using the IL-6-dependent mouse plasmacytoma
cell line B9 (see METHODS). COS7 supernatants from
the forward construct (rvIL-6) support B9 cell
proliferation measured by 3H-thymidine uptake
indicating that vIL-6 can substitute for cellular IL-6
in preventing B9 apoptosis (Figure 6). vIL-6
supported B9 proliferation is dose dependent with the
unconcentrated supernatant from the experiment shown
in Figure 6 having biologic activity equivalent to
approximately 20 pg per ml huIL-6.

Forty-three percent of noninduced BCP-1 cells (Figure
7A) have intracellular cytoplasmic vIL-6
immunostaining (see METHODS) suggestive of
constitutive virus polypeptide expression in cultured
infected cells, whereas no specific immunoreactive
staining is present in uninfected control P3HR1 cells
(Figure 7B). vIL-6 production was rarely detected in
KS tissues and only one of eight KS lesions examined
showed clear, specific vIL-6 immunostaining in less
than 2% of cells (Figure 7C). The specificity of this
low positivity rate was confirmed using preimmune sera
and neutralization with excess vIL-6 peptides. Rare
vIL-6-producing cells in the KS lesion are positive
for either CD34, an endothelial cell marker (Figure
[illegible]
:eils in KS lesicns produce vll^ 
-vjdco rare vIL-t pcsitiv— -~- 



1 ^ — C ' 



g p ar.-heaia = c? = i«-.i= =e._ rr.ar.^r 
5 3 ; derr.onsrrs'ir.c 

possible mat 

entering lytic phase replication which has
[illegible] occur using the KSHV T1.1 lytic phase RNA
probe. In contrast, well over half (65%) of ascitic
lymphoma cells pelleted from an HIV-negative PEL are
strongly positive for vIL-6 (Figure 7E) and express
the plasma cell marker EMA (Cesarman et al., 1995,
Blood 86, 2708-2714) indicating that either most
cells in vivo are replicating a lytic form of KSHV or
that latently infected PEL cells can express high
levels of vIL-6. No specific staining occurred with
any of the control tissues examined including normal
skin, tonsillar tissue, multiple myeloma or
angiosarcoma using either preimmune or post-immune
[illegible] antibody (Figure 7E and 7F).



2C 



*-om a cauent with AI-S-KS. wn: 



_ ^ r ^v-q - - 3 sues wa = rcunu oy 
- v - ir us dissemination t^ no..r^ 

= Tinmg a lvmph node fr: 

, " " _- T = vIL- 6 -seaming 
did nc: aevelop . 

. . ~ "-■ - e= vmoh node 

her;a - ODOiet;c cells were present m _...s -yr^i 

^iaure 8C: which was free of KS microscopically. 

^" - = creser.t m 

25 v"L-£ positive ^ympn noae w - 

relatively B-cell rich areas and some express CD20 B
cell surface antigen (Figure 8D), but not EMA surface
antigen (unlike PEL cells) (Cesarman et al., 1995,
Blood 86, 2708-2714).



No coloca-i^ation c_ 



t- _ ^-■a«e ^---qo- ""~3 or tne 

3C positivitv with the T ce., » u * 

^ " . mro ,., = e not-ficced. although 



macrophage antigen CD68 was cetecte 
magcevtosis cf vlL-S immuncpos 
macrophages was frequently observed. 



To investigate whether the vMIP-I can inhibit NSI
HIV-1 virus entry, human CD4+ cat kidney cells
(CCC/CD4) were transiently transfected with plasmids
expressing human CCR5 and vMIP-I or its reverse
construct I-PIMv (see CCR5 and vMIP-I cloning in
METHODS). These cells were infected with either M23
or SF162 primary NSI HIV-1 isolates which are known to
use CCR5 as a co-receptor (Clapham et al., 1992, J
Virol 66, 3531-3537) or with the HIV-2 variant ROD/B
which can infect CD4+ CCC cells without human CCR5.
Virus entry and replication was assayed by
immunostaining for retroviral antigen production
(Figure 9). vMIP-I cotransfection reduced NSI HIV-1
foci generation to less than half that of the
reverse-construct negative control but had no effect
on ROD/B HIV-2 replication.

Molecular piracy of host cell genes is a newly
recognized feature of some DNA viruses, particularly
herpesviruses and poxviruses (Murphy, 1994, Infect
Agents Dis 3, 137-154; Albrecht et al., 1992, J Virol
66, 5047-5058; Gao and Murphy, 1994, J Biol Chem 269,
28539-28542; Chee et al., 1990, Curr Top Microbiol
Immunol 154, 125-169; Massung et al., 1994, Virol 201,
215-240). The degree to which KSHV has incorporated
cellular genes into its genome is exceptional. In
addition to vMIP-I and vMIP-II, vIL-6 and vIRF, KSHV
also encodes polypeptides similar to bcl-2 (ORF 16),
cyclin D (ORF 72), complement-binding proteins
similar to CD21/CR2 (ORF 4), an NCAM-like adhesion
protein (ORF K14), and an IL-8 receptor (ORF 74). EBV
also either encodes (BHRF1/bcl-2) or induces (CR-2;
cyclin D; IL-6; bcl-2; adhesion molecules and an
IL-8R-like EBI1 protein) these same cellular
polypeptides (Cleary et al., 1986, Cell 47, 19-28;
Tosato et al., 1990, J Virol 64, 3033-3041; Palmero et
al., 1993, Oncogene 8, 1049; Larcher et al., 1995, Eur
J Immunol 25, 1713-1719; Birkenbach et al., 1993, J
Virol 67, 2209-2220). Thus, both viruses may modify
similar host cell signaling and regulatory pathways.
EBV appears to effect these changes [illegible] of
cellular gene expression whereas KSHV introduces the
polypeptides exogenously from its own genome.

Identification of these virus-encoded cellular-like
polypeptides leads to speculation about their
potential roles in protecting against cellular
antiviral responses. huIL-6 inhibits
γ-interferon-induced, Bax-mediated apoptosis in
myeloma cell lines (Lichtenstein et al., 1995,
Cellular Immunology 162, 243-255) and vIL-6 may play
a similar role in infected B cells. KSHV-encoded
vIRF, vbcl-2 and v-cyclin may also interfere with
host-cell mediated apoptosis induced by virus
infection and v-cyclin may prevent G1 cell cycle
arrest of infected cells. Interference with
interferon-induced MHC antigen presentation and
cell-mediated immune response (Holzinger et al., 1993,
Immunol Let 35, 109-117) by vIRF is also possible.
The β-chemokine polypeptides vMIP-I and vMIP-II may
have agonist or antagonist signal transduction roles.
Their sequence conservation and duplicate gene dosage
are indicative of a key role in KSHV replication and
survival.

Uncontrolled cell growth from cell-signaling pathway
dysregulation is an obvious potential by-product of
this virus strategy. Given the paucity of vIL-6
expressing cells in KS lesions, it is unlikely that
vIL-6 significantly contributes to [illegible].
KSHV induction of huIL-6, however, with subsequent
induction of vascular endothelial growth
factor-mediated angiogenesis (Holzinger et al., 1993,
Immunol Let 35, 109-117), is a possibility. vIL-6
could also potentially contribute to the pathogenesis
of KSHV-related lymphoproliferative disorders such as
PEL or the plasma cell variant of Castleman's disease.
The oncogenic potential of cellular cyclin and bcl-2
overexpression is well-established and these
virus-encoded polypeptides may also contribute to
KSHV-related neoplasia.

KSHV vMIP-I inhibits NSI HIV-1 replication in vitro
(Figure 9). Studies from early in the AIDS epidemic
indicate that survival is longer for AIDS-KS patients
than for other AIDS patients, and that 93% of US AIDS
patients surviving >3 years had KS compared to only
23% of remaining AIDS patients dying within 3 years of
diagnosis (Hardy, 1991, J AIDS 4, 386-391; Lemp et
al., 1990, J Am Med Assoc 263, 402-406; Rothenberg et
al., 1987, New Eng J Med 317, 1297-1302; Jacobson et
al., 1993, Am J Epidemiol 138, 953-964; Lundgren et
al., 1995, Am J Epidemiol 141, 652-658). This may be
due to KS occurring at relatively high CD4+ counts and
high mortality for other AIDS-defining conditions.
Recent surveillance data also indicate that the
epidemiology of AIDS-KS is changing as the AIDS
epidemic progresses (Ibid).

METHODS 

Genomic Sequencing. Genomic inserts were randomly
sheared, cloned into M13mp18, and sequenced to an
average of 12-fold redundancy with complete
bidirectional sequencing. The descriptive
nomenclature of KSHV polypeptides is based on the
naming system derived for herpesvirus saimiri
(Albrecht et al., 1992, J Virol 66, 5047-5055).

Open reading frame (ORF) analysis. Assembled sequence
contigs were analyzed using MacVector (IBI-Kodak,
Rochester NY) for potential open reading frames
greater than 25 amino acid residues and analyzed using
BLASTX and BEAUTY-BLASTX (Altschul et al., 1990, J Mol
Biol [volume and pages illegible]; [URL illegible]).
Sequences aligned to the four KSHV polypeptides in
italics included (name (species, sequence bank
accession number, smallest sum Poisson distribution
probability score)): (1) vMIP-I: LD78 (MIP-1α) (human,
gi 127077, p=[illegible]xe-22), MIP-1α (Rattus, gi
790623, p=2.[illegible]xe-20), MIP-1α (Mus, gi 127078,
p=1.7xe-1[illegible]), MIP-1β (Mus, gi 1346534,
p=7.8xe-18); (2) vMIP-II: LD78 (MIP-1α) (human, gi
127077, p=7.1xe-22), MIP-1α (Mus, gi 127078,
p=8.9xe-21), MIP-1α (Rattus, gi 790623, p=1.2xe-20),
MIP-1β (Mus, gi 1346534, p=3.8xe-20); (3) vIL-6: 26
kDa polypeptide (IL-6) (human, gi [illegible],
p=[illegible]xe-17), IL-6 (Macaca, gi [illegible],
p=[illegible]); (4) vIRF: ICSBP (Gallus, gi
[illegible], p=[illegible]), ICSBP (Mus, sp P23611,
p=1.0xe-10), lymphoid specific interferon regulatory
factor ([illegible], p=2.0xe-10), ISGF3 (Mus, gi
[illegible], p=[illegible]), [illegible] (human, gi
[illegible], p=[illegible]xe-9), ISGF3 (human, sp
Q00978, p=3.9xe-9), [illegible] (human, sp
Q02[illegible], p=[illegible]).
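
The ORF screen described above (open reading frames greater than 25 amino acid residues) can be illustrated with a minimal sketch. It is not the MacVector procedure, whose exact rules are not given here. It assumes an ORF runs from the first ATG in a frame to the next in-frame stop codon, and the function name find_orfs and its parameters are hypothetical.

```python
# Hypothetical sketch of a six-frame ORF screen; not the MacVector algorithm.
# Assumption: an ORF is ATG ... in-frame stop, and only ORFs encoding more
# than 25 residues (min_residues=26) are reported.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_residues=26):
    """Yield (strand, start, end, n_residues) for each qualifying ORF.

    start and end are 0-based, end-exclusive offsets on the scanned strand;
    end includes the stop codon. n_residues counts the initiating Met.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_residues = (i - start) // 3
                    if n_residues >= min_residues:
                        yield strand, start, i + 3, n_residues
                    start = None  # look for the next ORF in this frame


if __name__ == "__main__":
    demo = "ATG" + "GCT" * 30 + "TAA"  # made-up 31-residue ORF
    for orf in find_orfs(demo):
        print(orf)
```

Candidate ORFs found this way would then be translated and passed to a similarity search such as BLASTX; that step is not sketched here.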

Sequence alignment. Amino acid sequences were aligned
using CLUSTAL W (Thompson et al., 1994, Nucleic Acids
Res 22, 4673-4680) and compared using PAUP 3.1.1. Both
rooted and unrooted bootstrap comparisons produced
phylogenetic trees having all 100 bootstrap replicates
with viral polypeptides being less divergent from each
other than from the human polypeptides.

Northern blotting. Northern blotting was performed
using standard conditions with random-labeled probes
([illegible]) derived from PCR products for the
following primer sets: vMIP-I: 5'-AGC ATA TAA GGA ACT
CGG CGT TAC-3' (SEQ ID NO: 4), 5'-GGT AGA TAA
A[illegible] CCC CTT TG-3' (SEQ ID NO: 5); vMIP-II:
5'-[illegible] TTC TTC ACG [illegible]-3' (SEQ ID NO:
6), 5'-TGC TGT CTC GGT TAC CAG AAA AG-3' (SEQ ID NO:
7); vIL-6: 5'-TCA CGT CGC TCT TTA CTT ATC GTG-3' (SEQ
ID NO: 8), 5'-CGC CCT TCA GTG AGA CTT CGT AAC-3' (SEQ
ID NO: 9); vIRF: 5'-CTT GCG ATG AAC CAT CCA GG-3' (SEQ
ID NO: 10), 5'-ACA ACA CCC AAT TCC CCG TC-3' (SEQ ID
NO: 11) on total cell RNA extracted with RNAzol
according to manufacturer's instructions (TelTest Inc,
Friendswood TX), and 10 µg of total RNA was loaded in
each lane. BCP-1, BC-1 and P3HR1 were maintained in
culture conditions and induced with TPA as previously
described (Gao et al., 1996, New Eng J Med 335,
233-241). PCR amplification for these viral genes was
performed using the vMIP-I, vMIP-II, vIL-6, and vIRF
primer sets with 35 amplification cycles and compared
to dilutions of whole BC-1 DNA as a positive control
using PCR conditions previously described (Moore and
Chang, 1995, New Eng J Med 332, 1181-1185). KS
spindle cell line DNA used for these experiments was
described in Dictor et al., 1996, Am J Pathol 148,
2009-2016. Amplifiability of DNA samples was
confirmed using human HLA-DQ alpha and pyruvate
dehydrogenase primers.
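
Because the primer sequences are printed in spaced triplets, a transcription can be checked against the lengths stated in the Sequence Listing. The sketch below is illustrative only and uses the SEQ ID NO: 4 primer as it appears in the Sequence Listing (24 bases). The helper names normalize_primer, gc_fraction and reverse_complement are hypothetical.

```python
# Hypothetical helpers for checking a transcribed primer; illustrative only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize_primer(text):
    """Keep only A, C, G, T so that 5'/3' labels, hyphens and spaces drop out."""
    return "".join(ch for ch in text.upper() if ch in "ACGT")


def gc_fraction(primer):
    """Fraction of bases that are G or C."""
    return sum(primer.count(base) for base in "GC") / len(primer)


def reverse_complement(primer):
    """Reverse complement, i.e. the strand the primer anneals to, read 5' to 3'."""
    return primer.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    seq4 = normalize_primer("5'-AGC ATA TAA GGA ACT CGG CGT TAC-3'")
    print(len(seq4), round(gc_fraction(seq4), 3), reverse_complement(seq4))
```

A length that disagrees with the LENGTH field of the corresponding Sequence Listing entry points to a transcription error in one of the two places.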

vIL-6 cloning. vIL-6 was cloned from a 695 bp
polymerase chain reaction (PCR) product using the
following primer set: 5'-TCA CGT CGC TCT TTA CTT ATC
GTG-3' (SEQ ID NO: 12) and 5'-CGC CCT TCA GTG AGA CTT
CGT AAC-3' (SEQ ID NO: 13), amplified for 25 cycles
using 0.1 µg of BC-1 DNA as a template. PCR
product was initially cloned into pCR 2.1 (Invitrogen,
San Diego CA) and an EcoRV insert was then cloned into
the pMET7 expression vector (Takebe et al., 1988, Mol
Cell Biol 8, 466-472) and transfected using
DEAE-dextran with chloroquine into COS7 cells
(CRL-1651, American Type Culture Collection, Rockville
MD). The sequence was also cloned into the pMET7
vector in the reverse orientation (6-LIv) relative to
the SRα promoter as a negative control. [Illegible]
fidelity of both constructs was confirmed by
bidirectional sequencing using dye-[illegible]
Inc, Foster City CA).



15 ml of serum-free COS7 supernatants were
concentrated to 1.5 ml by ultrafiltration with a
Centriplus 10 filter (Amicon, Beverly MA) and
[illegible] of supernatant concentrate or 1 µg of
rhuIL-6 (R&D Systems, Minneapolis MN) was loaded per
each lane in Laemmli buffer. For cell lysate
immunoblotting, exponential phase [illegible] cells
with and without TPA induction for 48 hours were
pelleted and [illegible] µg of whole cell protein
solubilized in Laemmli buffer was loaded per lane,
electrophoresed on a [illegible]% SDS-polyacrylamide
gel and immunoblotted and developed using standard
conditions (Gao et al., 1996, New Eng J Med 335,
233-241) with either rabbit antipeptide antibody
(1:[illegible]-1:1000 dilution) or anti-huIL-6
([illegible] per ml, R&D Systems, Minneapolis MN).

Cell line B9. B9 mouse plasmacytoma cell line was
maintained in Iscove's Modified Dulbecco's Medium
(IMDM) (Gibco, Gaithersburg, MD), [illegible]%
[illegible] serum, 1% penicillin/streptomycin, 1%
glutamine, [illegible] β-mercaptoethanol, and 10 ng
per ml rhuIL-6 (R&D Systems, Minneapolis, MN).
3H-thymidine uptake was used to measure B9
proliferation in response to huIL-6 or recombinant
supernatants according to standard protocols (R&D
Systems, Minneapolis, MN). Briefly, serial 1:2
dilutions of huIL-6 or Centriplus 10 concentrated
recombinant supernatants were incubated with
2x10[exponent illegible] cells per well in a 96 well
plate for 24 hours at 37°C with 10 µl of thymidine
stock solution (50 µl of 1 mCi/ml 3H-thymidine in 1 ml
IMDM) added to each well during the final four hours
of incubation. Cells were harvested and incorporated
3H-thymidine was determined using a liquid
scintillation counter. Each data point is the average
of six determinations with standard deviations shown.
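
The arithmetic behind a serial 1:2 dilution series and a per-point summary of six determinations can be sketched as follows. All numbers in the example are made up, the function names are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the text does not say which form was used.

```python
# Hypothetical sketch of the dilution series and replicate summary.
# Assumption: "standard deviation" means the sample (n-1) form.
from statistics import mean, stdev


def serial_dilutions(start, steps, factor=2.0):
    """Concentrations for a serial 1:factor dilution, highest first."""
    return [start / factor ** i for i in range(steps)]


def summarize_replicates(counts):
    """Return (mean, sample standard deviation) of replicate counts."""
    return mean(counts), stdev(counts)


if __name__ == "__main__":
    # Made-up example: 10 ng/ml starting concentration, four dilution steps,
    # six invented scintillation counts (cpm) for one data point.
    print(serial_dilutions(10.0, 4))
    print(summarize_replicates([5200, 4900, 5100, 5050, 4980, 5150]))
```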

vIL-6 immunostaining. Immunostaining was performed
using the avidin-biotin complex (ABC) method after
deparaffinization of tissues and quenching for 20
minutes with 0.03% H2O2 in PBS. The primary antibody
was applied at a dilution of 1:1250 after blocking
with 10% normal goat serum, 1% BSA, 0.5% Tween 20.
The secondary biotinylated goat anti-rabbit antibody
(1:200 in PBS) was applied for 30 minutes at room
temperature followed by three 5 minute washes in PBS.
Peroxidase-linked ABC (1:100 in PBS) was applied for
20 minutes followed by three 5 minute washes in PBS.
A diaminobenzidine (DAB) chromogen detection solution
(0.25% DAB, 0.01% H2O2 in PBS) was applied for 5
minutes. Slides were then washed, counterstained with
hematoxylin and coverslipped. Amino ethyl carbazole
(AEC) or Vector Red staining was also used, allowing
better discrimination of double-labeled cells with
Fast Blue counterstaining for some surface antigens.
For CD68, in which staining might be obscured by vIL-6
cytoplasmic staining, double label immunofluorescence
was used. Microwaved tissue sections were blocked
with 2% human serum, 1% bovine serum albumin (BSA) in
PBS for 30 minutes, incubated overnight with primary
antibodies and developed with fluorescein-conjugated
goat anti-rabbit IgG (1:100, Sigma) for vIL-6
localization and rhodamine-conjugated horse anti-mouse
IgG (1:100, Sigma) for CD68 localization for
[illegible] minutes. After washing, secondary
antibody incubation was repeated twice with washing
for 15 minutes each to amplify staining. For the
remaining membrane antigens, slides were developed
first for vIL-6 and then secondly with the cellular
antigen, as well as in the reverse localization
(cellular antigen first, anti-vIL-6 second), to
achieve optimal visualization and discrimination of
both antigens. In each case, the first antibody was
developed using [illegible] with blocking solution
preincubation (1% BSA, 10% normal horse serum, 0.5%
Tween 20 for 30 minutes) and development per
manufacturer's instructions. The second antibody was
developed using the ABC-alkaline phosphatase technique
with Fast Blue chromagen. Both microwaving and
trypsinization resulted in poorer localization and
specificity of vIL-6 immunolocalization. In cases
where this was required for optimal localization of
membrane antigen, these techniques were applied after
vIL-6 [illegible] immunolocalization. Vector-Red
(Vector, Burlingame, CA) staining was used as an
alternative stain to ABC to achieve optimal
discrimination and was performed per manufacturer's
protocol using the ABC-alkaline phosphatase technique.
Cell antigen antibodies examined included CD68
(1:[illegible], from clone Kim 6), epithelial membrane
antigen (EMA, 1:500, Dako, Carpinteria, CA),
[illegible] (1:100, Dako), [illegible] (1:200, Dako),
OPD4 (1:100, Dako), CD[illegible] ([illegible], Dako),
CD45 (1:400, from clone [illegible], Immunotech,
Westbrook, ME) and Leu[illegible] (1:100,
Becton-Dickinson, San Jose, CA) prepared according to
manufacturer's instructions. Specific vIL-6
colocalization was only found with CD34 and CD45 in KS
lesions, EMA in PEL, and CD20 and CD45 in lymph node
tissues.



Immunohistochemical vIL-6 localization was performed
on exponential phase [illegible] cells with or without
48 hour TPA incubation after embedding in 1% agar in
[illegible]. The percentages of positive cells were
determined from cell counts of [illegible] random high
power microscopic fields per slide. Lower percentages
of [illegible] cells stain positively for vIL-6 after
TPA treatment, possibly reflecting cell lysis and
death from lytic virus replication induction by TPA.

Immunostaining of cells and tissues was demonstrated
to be specific by neutralization using [illegible]
incubation of antisera with 0.1 µg/ml vIL-6 synthetic
peptides at 4°C and by use of preimmune rabbit antisera
run in parallel with the postimmune sera for the
tissues or cell preparations. No specific staining
was seen after either peptide neutralization or use of
preimmune sera.

CCR5 and vMIP-I cloning. CCR5 was cloned into pRcCMV
vector (Invitrogen) and both forward and reverse
orientations of the vMIP-I gene were cloned into pMET7
after PCR amplification using the following primer
pairs: 5'-AGC ATA TAA GGA ACT CGG CGT TAC-3' (SEQ ID
NO: 14), 5'-GGT AGA TAA ACT CCC CCC CTT TG-3' (SEQ ID
NO: 15). CCR5 alone and with the forward construct
(vMIP-I), the reverse construct (I-PIMv) and empty
pMET7 vector were transfected into CCC/CD4 cells (CCC
cat cells stably expressing human CD4, see McKnight et
al., 1994, Virol 201, 8-18) using Lipofectamine
(Gibco). After 48 hours, media was removed from the
transfected cells and 1000 TCID50 of SF162, M23 or
ROD/B virus culture stock was added. Cells were
washed four times after 4 hours of virus incubation
and grown in DMEM with 5% FCS for 72 hours before
immunostaining for HIV-1 p24 or HIV-2 gp105 as
previously described. Each condition was replicated
3-4 times (Figure 9) with medians and error bars
representing the standard deviations expressed as
percentages of the CCR5 alone foci.
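
One way to express replicate focus counts as percentages of the CCR5-alone control, with a median and standard deviation per condition, is sketched below. The counts are invented, the function name percent_of_control is hypothetical, and the choice to scale by the median of the control replicates and to use the sample standard deviation are assumptions, since the text does not specify them.

```python
# Hypothetical sketch: scale focus counts to the CCR5-alone control.
# Assumptions: the control reference is the median of its replicates and
# the spread is the sample (n-1) standard deviation.
from statistics import median, stdev


def percent_of_control(foci, control_foci):
    """Return (median, standard deviation) of foci as percent of control."""
    reference = median(control_foci)
    if reference == 0:
        raise ValueError("control median is zero; percentages are undefined")
    scaled = [100.0 * count / reference for count in foci]
    return median(scaled), stdev(scaled)


if __name__ == "__main__":
    ccr5_alone = [200, 180, 220]      # invented control counts
    ccr5_plus_vmip = [100, 90, 110]   # invented test-condition counts
    print(percent_of_control(ccr5_plus_vmip, ccr5_alone))
```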
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WO 97/27208 



PCT/US97/0144: 



10. Goldberg and Faquin, INTERLEUKIN 6 TO STIMULATE
ERYTHROPOIETIN PRODUCTION, PATENT NO. 5,1[illegible],
ISSUED February 22, 1993;

11. Miles et al., METHOD TO TREAT KAPOSI'S SAR[text cut off]
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Table 1. KSHV Genome ORFs and their similarity to 
genes in other herpesviruses. 
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121 


[Table 1 body: the entries of the Pol, Start, Stop, Size, HVS%Sim,
HVS%Id, EBV Name, EBV%Sim and EBV%Id columns are not recoverable
from the scanned image. The Name column and the putative function
column follow.]

Name
K1
ORF4*
ORF6
ORF7
ORF8
ORF9
ORF10
ORF11
K2
ORF2
K3
ORF70
K4
K5
K6
K7
ORF16
ORF17
ORF18
ORF19
ORF20
ORF21
ORF22
ORF23
ORF24
ORF25
ORF26
ORF27
ORF28
ORF29b
ORF30
ORF31
ORF32



Complement binding protein (v-CBP)

ssDNA binding protein (SSBP)
Transport protein
Glycoprotein B (gB)
polymerase (pol)

vIL-6
DHFR

BHV4-IE1 I

Thymidylate synthase (TS)
vMIP-II
BHV4-IE1 II
vMIP-I

Bcl-2

Capsid protein I

Tegument protein I

Thymidine kinase (TK)
Glycoprotein H (gH)

Major capsid protein (MCP)
Capsid protein II

Packaging protein II

ORF33
ORF29a
ORF34
ORF35
ORF36
ORF37
ORF38
ORF39
ORF40
ORF41
ORF42
ORF43
ORF44
ORF45
ORF46
ORF47
ORF48
ORF49
ORF50
K8

ORF52
ORF53
ORF54
ORF55
ORF56
ORF57
K9
K10
K11
ORF58
ORF59
ORF60
ORF61
ORF62
ORF63
ORF64
ORF65
ORF66
ORF67
ORF68
ORF69
K12
K13
ORF72
ORF73
K14
ORF74
ORF75
K15



Packaging protein I

Viral protein
Alkaline exonuclease (AE)

Glycoprotein M (gM)
Helicase-primase, subunit 1
Helicase-primase, subunit 2

Capsid protein III
Helicase-primase, subunit 3
Virion assembly protein
Uracil DNA glycosylase (UDG)
Glycoprotein L (gL)

Transactivator (LCTP)

dUTPase

DNA replication protein I
Immediate-early protein II (IEP-II)
vIRF1 (ICSBP)

Phosphoprotein
DNA replication protein II
Ribonucleotide reductase, small
Ribonucleotide reductase, large
Assembly/DNA maturation
Tegument protein II
Tegument protein III
Capsid protein IV

Tegument protein IV
Glycoprotein

Kaposin
Cyclin D

Immediate-early protein (IEP)
OX-2 (v-adh)

G-protein coupled receptor
Tegument protein / FGARAT



Legend to Table 1. Name (e.g. K1 or ORF4) refers to
the KSHV ORF designation; Pol signifies polarity of
the ORF within the KSHV genome; Start refers to the
position of the first LUR nucleotide in the start
codon; Stop refers to the position of the last LUR
nucleotide in the stop codon; Size indicates the
number of amino acid residues encoded by the KSHV ORF;
HVS%Sim indicates the percent similarity of the
indicated KSHV ORF to the corresponding ORF of
herpesvirus saimiri; HVS%Id indicates the percent
identity of the indicated KSHV ORF to the
corresponding ORF of herpesvirus saimiri; EBV Name
indicates the EBV ORF designation; EBV%Sim indicates
the percent similarity of the indicated KSHV ORF to
the named Epstein-Barr virus ORF; EBV%Id indicates the
percent identity of the indicated KSHV ORF to the
named Epstein-Barr virus ORF. The asterisks in the
KSHV Name column indicate comparison of KSHV ORF4 to
HVS ORF4a (*) and HVS ORF4b (**). The entire
unannotated genomic sequence is deposited in GenBank
under the accession numbers: U75698 (LUR), U75699
(terminal repeat), and U75700 (incomplete terminal
repeat). The sequence of the LUR (U75698) is also set
forth in its entirety in the Sequence Listing below.
Specifically, the sequence of the LUR is set forth in
5' to 3' order in SEQ ID NOs: 17-20. More
specifically, nucleotides 1-35,100 of the LUR are set
forth in SEQ ID NO: 17 numbered nucleotides 1-35,100,
respectively; nucleotides 35,101-70,200 of the LUR
are set forth in SEQ ID NO: 18 numbered nucleotides
1-35,100, respectively; nucleotides 70,201-105,300 of
the LUR are set forth in SEQ ID NO: 19 numbered
nucleotides 1-35,100, respectively; and nucleotides
105,301-137,507 of the LUR are set forth in SEQ ID
NO: 20 numbered nucleotides 1-32,207, respectively.
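
Since the LUR is split across four sequence entries, a Start or Stop coordinate from Table 1 has to be converted before it can be looked up in the Sequence Listing. The sketch below encodes the segment boundaries as read from the legend (35,100 nucleotides in each of SEQ ID NOs: 17-19 and 32,207 in SEQ ID NO: 20). The function name locate_in_listing is hypothetical.

```python
# Hypothetical sketch: map a 1-based LUR coordinate to (SEQ ID NO, position).
# Segment boundaries are taken from the legend to Table 1.

SEGMENTS = [
    (17, 1, 35100),
    (18, 35101, 70200),
    (19, 70201, 105300),
    (20, 105301, 137507),
]


def locate_in_listing(lur_position):
    """Return (seq_id_no, position_within_that_sequence), both 1-based."""
    for seq_id, first, last in SEGMENTS:
        if first <= lur_position <= last:
            return seq_id, lur_position - first + 1
    raise ValueError(f"LUR position {lur_position} is outside 1-137507")


if __name__ == "__main__":
    for pos in (1, 35100, 35101, 105301, 137507):
        print(pos, locate_in_listing(pos))
```

An ORF whose Start and Stop fall in different segments spans two sequence entries and has to be read across the boundary.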
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-r- -COMMUNICATION INFORMATION : 
" TaT ~TELEFHONE : :iH : "S-0400 



TELEFAX: (212) 391-0525

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 336 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
,et Phe Pro Phe Val Pro Leu Ser Leu Tyr Val Ala Lys Lys Leu Phe 



b 10 



Arc Ala Arc Gly Phe Arc Phe Cys Gin Lys Pro Gly Val Leu Ala Leu 
20 

Ala Pro Glu val Asp Pro Cys Ser lie Gin His Glu Val Thr Gly Ala 

3 5 "* 0 

Slu Thr Pro His Glu Glu Leu Gl, Tyr Leu Ar 9 Gin Leu Arg Glu He 



_ r 55 



pcm , S97/om-i: 
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12 £ 

Leu Cvs Arg Gly Asp Arg Leu As? Arg Thr Gly lie Gly Thr ieu 

65 70 
Ssr Leu Phe Gly Met Glr. Ala Arg Tyr Ser -eu Arg Asp H iE Phe Pr = 

8= 50 
- ^ ~v,^ t,. s vai Phe Trr Arc Gly Val Val Gin 3_u Leu 

— ^ — - ; 0 *: ~- - ic= Lie 



Leu Tr 



G 1 v V a. 1 
130 



Dn e L~u Lvs Giv Ser Thr Asp Ser Arg Glu Leu Ser Arg Thr 

ill ~ " -120 125 

t..- -- e --c As- Lvs Asn Gly Ser Arg Glu Phe Leu Ala Gly 

"~ " 140 



, w , w « u Ala His Arg Arg Glu Gly Asp Leu Gly Pro Vai Tyr Gly 

£f — ~ 150 155 

P*e Gin — Ara His Phe Gly Ala Ala Tyr Val Asp Ala Asp Ala Asp 

cl .. ^ v Phe Asn Gin Leu Ser Tyr lie Val Asp Leu He 

~ ' Ho" ' ' 165 l gc 

Lvs Asn Asr. Pro His Asp Arg Arg lie lie Met Cys Ala Trp Asr. Pre 
^ c c 200 - " - 

-i- As- -°u Leu Met Ala Leu Pre Pro Cys His Leu Leu Cys Gin 

I1C " 3 

T> h o — Va 1 Ala Asc Glv Glu Leu Ser Cys Gin Leu Tyr Gin Arg Ser 

;;r - ■ Z2 b Z3S 

Giv Asp Met Gly Leu Gly Val Pro Phe Asn He Ala Ser Tyr Ser Leu 

245 - 50 ' 

-«u -hr Tvr Met Leu Ala His Val Thr Gly Leu Arg Pro Gly Glu Phe 

" 2SC 265 -70 

Tie r~ L~u Glv As" Ala His lie Tyr Lys Thr His lie Glu Pro 



Leu Arg Leu Gin Leu Thr Arg Thr Pro Arg Pro Phe Pre Arg Leu Glu 

2 SC =55 ■ 30. 

"I- Le- Ar- va" S*»r Ser Met Glu Glu Phe Thr Pro Asp Asp Phe 

30i " = " 310 315 32^ 

Arg Leu Val Asp Tyr Cys Pre His Pro Thr He Arg Met Glu Met Ala 




Asp Arg 
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'rc .-.5p .r.r r.- 



ll::—h: :: a-::.: £ = i== 
;r TYPE: £-::.: s = 1 = 
I T 3: : linear 

XILECYLE TY?E: pep- ice 

-7:"—::::: SEC H< KC:2: 

- r_ ^ _ ~ ' 

:.£: * His .--3p .-.r; 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AGCATATAAG GAACTCGGCG TTAC

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGTAGATAAA [remainder illegible]

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: [illegible] base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

[sequence not legible in source]

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TGCTGTCTCG GTTACCAGAA AAG

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TCACGTCGCT CTTTACTTAT CGTG

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CGCCCTTCAG TGAGACTTCG TAAC

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CTTGCGATGA ACCATCCAGG

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ACAACACCCA ATTCCCCGTC

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TCACGTCGCT CTTTACTTAT CGTG

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CGCCCTTCAG TGAGACTTCG TAAC

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

AGCATATAAG GAACTCGGCG TTAC

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GGTAGATAAA CTCCCCCCCT TTG

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 801 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CGTGAACACC CCGCGCCCCG CGZZZZZZ^C ACCGCGCCGC CCCTCCCCCT CCCCCCGCTC 6 0 

GCCTCCCGGC GCTGCCGCCA GGCCCCGGCC GGAGCCGGCC GCCCGCGGGG GGCAGGGCGO 120 
GCCCGGCGGC TCCCTCGCGG GGCGGGGGAC GGGGGAGGGG GGCGCZGGGZ CCCCGCGCGO 180 



pct/us9"/ou4: 
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i::fo?-KAt:3:; for seq 



NO : I 7 : 



= -;-UEI"E CKARAGTEF-ISTICS : 

*" ,; A LENGTH : 3 510 0 base pair s 

TV PE : nucleic acid 
■C STP-ANOEDNESS : double 

--POL.OGV : linear 



MOLE 



3'JXE TYPE : PN A (genomic) 



24 : 



CG3333A333 3A3333GAGG GCCC3CGCCG G3333CAG3G GCGGCGGAGG GCGCGGGGGC 
CCGAGCC3G3 AGGGGGGCCG GGGTACGGGG CTAGGCGACG AATAATTTTT TTTTGGG303 2ZZ 
GCCC===GAA G3T0TGTCGG CCGC3CGGTC CGCGCGGCG3 GCGGGCG3G3 3333333333 36: 
ItI1^3A33 GGGGGGGGGA TGCGGCGGGG G3GGCG333G 0GGGGGCGG3 GGG3CTT33T 12Z 

G - GGC CG 333GGGCGCG AGCCGCGCGG GGGCGGCGGG G3C3CCCT32 

CCCGGGGGGC TC3GCGGGGG GCCCCCTGTC CCCGCGCGGG CCGGCGAGC3 ZZG^ZZ^ 
„_~~.— GA T333G3GGGC GZZZZGZZZZ CCTGCCGGGG ACGCCGCCGG GCGTGGGGG3 
C2T333G33C GGG3ATGGGG CCGCGCGCCG C3TCAGGGC3 CGGCGGGGCC G3C3C" 
— — G3CCC0 33 33333333 GAACCCGGGC AGCGAGGGAA GGGGG3G333 TG3 



4eC 

54 ■: 
5C-: 

78 C 

8c; 



(xi ' SEQUENCE DES GF.IPTIOII : SEC ID NO: 17 
TA3TAATTTT CAAAGGCGG3 GTTCTGCGAG GCATAGTCTT 
TAAACCTGT 3 TTT3AGACCT TGTTGGACAT 33TGTACAAT 
TCTGCAGTCT 33 CG3TTTG 3 TTT3GAGGA3 TATTAAGCCT 
ATTTGTG 333 TGGAGTGATT TCAAGGCCTT ACACGTTGAC 
TGC3AATAT3 GTGGTATTGG AACAATA3T3 GG3TTTTG 33 
TTCTTGACAC CATTGCCTG 3 AATTTTACTT GTGTGGAAGA 
TTTGGATTA3 ATGGCGTG3A CAAC3TGT3T TACAAACCTT 
CAGT3ACTT3 T G G T C AG CAT GTTACTTTGT ATTGTTCTAC 
TTTGGCATTT AC 3 AAACGG A 3GAAATGAAA C3GTGT3ACA 
CGCTGATGA3 3 GAAACTGAG GGGTGTTATA CTTGTT CTAA 
CAAATCGTAT ATGTTTTTGG GCGCGTTGTG Z C AATATAAC 
CTGTCAGCAG TACTACAGGC TTTAGAACAT TGAGTACTAA 
ATGCAAC 3 A3 A3GTGATGTA GTTGTAGTGA AAGAAGCAAA 
AAGTGCATTT T3TTGTATTT ATGACACT3G TAGCTCTGAT 



TTTTTCTGGC GG333TTGTG 
CAAGAT3TT3 CTGTAT3TTG 
TTCTCT33TA TC3T3TC3AA 
CT3T3T3T3T AATGCAT33T 
AGTGACGGAG AGAAGAGTCA 
AT3TGGG3AT CG AC AG AG C A 
GTGTGCACAG CCATCAAACA 
3T3TGGAAAT AATGTTACCG 
AACTAAATA3 TATAATTTTA 
CGGGCTGTCG TCTCGGCTGT 

TAG 3TTAGTG AAGATAATC3 

AGGAACCATG TGT3GTATCT 
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TAGGAACTAT TATCTTTGCC CATTGTCAAA AACAACG . G~ C. ^ 
AACAATTGCA G G ATT ATT AT TCCCTACACG ATTTGTGCAC GGAA3ACTAT ACGCAACCAG 
TGGATTGGTA CTGACATTCA GGTAAGATAA T CTAAAT ATT CTCTATAACA TAATTGTAA7 IZZZ 
g ^ J -...,,. 1TATG t-TATAGCTA CAAATGTTTT ATGCAAAATA CATTTTATGA GGTCG3ATAC 
TTATTAAAAG CATTGTCTTA AGTACATTAA AAGGACATTG TATAACCGTG CTACTTACAG 
CATGGCCTTT TTAAGACAAA CACTGTG GAT TTTATGGACA TTTACCATGG TTATTGGCCA 12 00 

GGACAATGAA AAGTGTTCCC AAAAAACCTT AATTGGATAT AGACTTAAAA TGTCTCGTGA 126 C 

CGGTGACA77 GCAGTTGGAG AAACAGTGGA ATTACGTTGT AGATCTGGAT ACACTACTTA 12 ZZ 

TGCCCGCAA7 A7AACAGCAA CATGTTTACA AGGTGG3ACG TGGTCTGAAC CAACG3CAAC 12 8: 

ATGTAACAAA AAGTCCTGTC CAAACCCAGG TGAAATACAA AATGGAAAGG TTATATTTCA 144 : 

TGGTGGACAA GATGCCTTAA AATATGGGGC AAA C ATTT C A TATGTTTGTA ATGAAGGATA 150C 
TTTTT7GGTT GGTCGAGAAT ACGTGCGATA TTGTATGATT GGAGCATCTG G C CAAATGG Z " 1 = 5 C 
GTG3~CATC7 7C7CC7CC77 TTTGTGAAAA AGAAAAGTGT CACAGACCGA AAA7CAAAAA 16 2 0 

TGGAGATTTT AAGCC7GA7A AAGATTATTA T 3 AGTATAAT GATGCAGTTC ATTTT 3 AATG 
TAAT3AAGGA TATACTC7AG TTGGACCACA TTCCATTGCA TGTGCAGTTA ATAACACGTG 
ATGCCAACCT GTGAACTCGC AGGCTGTAAA TTTCCATCGG TGACTCATGG 
CAAG3TTTTT CTCTTACTTA T AAA CAT AAG CAAAGT3TTA CTTT73CA7C- 
TTTGTTC7CA GAGGATCCCC CACAA77ACG 7G7AACG77A C73AA73GGA 
CCCACCAC77 CCTAAGTGTG TTTTGGAAGA 7A7AGA7GA7 C C AAA 3AATT GAAA7 Z 7TGG 1°8C 
ACSTTTGCA7 CCAACAC33A ATG AAAAAC Z AAATGGTAA7 G7C7T7CAAC GCTGAAACTA 2 04 0 

7ACAGAACC7 CCAACAAAGC C7GAAGACAC CCA7ACAGCA GCTACTTGTG A7A3CAAC7G 2 ICC 

TGAACAGC3A C CTAAAAT Z Z TGCCAACATC CGAAGGTTTT AA7GAGAC7A CCACATCTAA 216 0 

TACAATTACA AAA 2 AATT AG AG G ATG AG AA AA CT AT AT C C CAGC CAAATA 3 AC AT ATT AC 2220 
ATCTGCCTTA ACATCCATGA AAGCGAAAGG TAACTTTACC AACAAGACCA ATAACTCTA7 
TGATCTACAT AT AG C G T CT A CACCCACTTC C C AAG ATG AT GCTACGCCTT CAATACCTAG 
TGTACAGACA C C C AATT AT A AT ACTAACG C ACCGACACGT ACACTAACGT CT CT C CAT AT 
TGAAGAAGGC CCA7CCAA7T CTACTACTTC AGAAAAGGCC ACTTCCTCTA CTCTCTCACA 
CAACTCACAC AAAAATGACA CCGGAGGCAT ATACACAACA TTAAACAAAA CAACACAGTT 
GCCATCCACT AATAAACCTA CAAACAGTCA AGCCAAGAGT TCCACTAAGC CAC3CGTTGA 
GACACACAAT AAAACAACCA GTAATCCTGC CATTTCTTTA ACAGATTCTG CAGATGTGCC 
TCAGAGACC3 C3AGAACCAA CACTCCCTCC C ATTTT CAGG CCACCGGCGT CTAAAAATCG 
CTATCTGGAA AAG C AA C TAG TTATTGGACT ACTAACCGCT GTCGCCCTAA CGTGTGGACT 
GATTACCTTA TTTCACTATC TGTTCTTTCG TTAGCCTAGA ACTTGCTCCA GTGTTAGACA 
GGGCTATGAT TGCTTCTCCA CGCTGTCCAC CTTAACACTT CCCAATAACA AATCCGGTAT 
.-AGGAGG37 GA7A77AC7A A7G7AACC7A AAAAA7G7GC A7G7337A7G 7A7737A...
AAGA7AGCGA "AA7A7AAG ACAAC7AA7A 77AACGA7AG 7373G3777C T7737A7AA. 
A7AC3C37G7 33GAAAGCGA CAGAA33GGG CGGCG77TCC A7A7GAG3" AA37GrA77G 

G"AT~TAr- sgscsstsa: :acc-cactat astgcgcg— G-G3CAGAAA a77.a=a"; 

TATATAAACA A33AAAG3G3 ACT=TSCG=G C77AAGCGCC AAG3GA77A7 ACACACGGGT 
T7777G77G7 C77GGCCAA7 7GTG7CTCGA TGGC0C7AAA 3G3ACCACAA A===.=3AaG 
AAAA7A77G3 3TG73CGGCC C7GACT3G7C CCTGCGGG7A CC7G7A73" TAT=T3 ~*= 

AGAAr—rrr cataggggaa gcctggctgc tgggcaatgg gtacccgga^- gcaaaag7at 

TTTCACTAC: TGTTTTGGAG 3GG"=ACA3 7GGAATCCGA TTTC""TA AA"TAAA3G 
C3GTG=A=AA 3AAAA7CGA7 GSAACCACAG C7T7TGTGAA 



AGGCCA7C37 "77GA7AA7 AC77ACTTAT TTCAGCCAAT C777CA 

T 7G AG AG AG 7 7GAGAGCTGT TTGGA7.- 
:-C-GAC G77C7GGAG7 CCA 




AACGT7T7TT 



C7TG73 7G AA-un--.- 



TTATGGCGG7 7ATAGTTACA GAGGGATTCA AG GAG AG ACT GTACGGCGGC AAA77GGTG7 
77GTGCC7T7 TCAGACAACG 2C7G7ACA7A TTGGGGAACA 7CAGGC7-7T7 AAGA7A777. 
TG7A7GA2GA GGATCTGTTT GG7C7AAGTC GCGC7CAAGA ACTATGTAGG 7TTTA7AAC7 
CCGATATCAG TAGA7A7CTA 7A7GA7TC7A TATT7A7TG7 AATAG7A7AG G777TAAGGG 
TAAAGGAC37 7AG7ACGGT7 ATC7AAGC77 CAGAAAGG C A ATTTGTG7AC GA77AA7A7A 

G3T7TAC777 AATGGTGATA GACAGTCTG3 TGGCTGAAC7 TGGTATGAG7 ^ ^ ^ 
C 77TT ATTG A GGGA7C7CAG GATAG7TG7G AGG7TC7AAA 7TA7GACACG .G^77_~.7. 
TTGAAAAC7G 7GAGACG7CA GATG7C7G77 7T7GTGCA77 AGAAG777G3 C~_— ««« - 
AGGC7TTG7A 7A77GGCGCC CAGCTGTTTG CGG7CAAC77 TGTG7T77A7 C7GACCAGAG 
TGGCAAAGC7 GCCTCAGAAG AAT CAGAG AG GAGACGCCAA 7ATG7ACAA7 77ATT7TAC7 
TACAGCA7GC- C7TGGGATAC 7777*7 AG AGG CAACAG7AAA GGAAAATGGA G77T77GC77 
TCAAGGGCG7 GCCAGTG7CT GCACTGGA7G G G 7 CAT 7TT A CA777T77AG 7AC77GGC77 
ACGCGT7777 TT77TC7CCA CATCTC7TGG CAAGGA7GTG TTA77ATCTG 7AG7T7T-G7 
7CCACCA7A^. AAACAC 7 AAC AGTCAGTCA7 ACAATGTGGT GGA77A73TG GG7A77GCGG 
CA77TAGT7A AA7GTGTGA7 7TGTGTCAG3 GGCAATGT77 AG 7 T 3 TAT G 7 AT 7 AA7 A 7 G 7 
TGTTT7ACAG GA7GAAGGA7 AGGTTC7CA7 CTGTTCTGTC AAACGTTAAG ~G^~---~T 
ATGTGATCA7 GGG7ACAG7G GGAACGTACA ATGAC77AGA GATT7T7GGA AA7TT7G77A 

T-ACSG3. GAGAGAGGAG GAGGGGAATC CTGTGGAAGA 7G7T77AAAG TATA7A7AT7 

GGCAAC7A7C- CCAGAATATA AC7GAGAAGC TAGCGTCCA7 GGGCA7C7CG GAGGG7GGCG 
A _ 3CCCT ^- AACCCTCATT GTGGACATC7 CCAG7TT7G7 CAAAG7GTTC AAGGGGATAG 
ACAG2A2337 AGAGGCAGAG 2722^ AA

AC77CAGAGA GAACATCAAA 7CCGT2CA7C ACA72C772A G777GCA7G2 AAC2-7A7A27 
GGCAGG2G22 G7G22CGG77 777C7GA2CC 777AC7ACAA G72A77G27G ACG2-72A7A2 512 : 

AGGA2A7A7G 7G7GACG7CA 7G7A7GA7G7 ACGAGCAGGA 2AA27CGG22 G7GGGAA77G S16Z 
TACCATCC 3A G7GGC77AAA A7GCAC77TC AGACAATGTG GACCAA77T2 AAGGG7G2C7 5 22C 

GC772GA2AA AGGAGCAATC ACGGGCGGGG AAC7AAAAA7 AG7CCACCAG T2CA7G7727 52 EG 

ZT-^—C-Z 7GA2ACCGAC GCTGCCA7AG GAGGGATG77 TGCACCCGC7 CGGA7G2AGG 5 34 0 

7CAi - GAT - i . 3;: cagag;:aatg CTCATGGTTC CAAAAACCA7 AAAAA7AAAA AACAGGA7CA 54 c : 

--^ C7C2ACCGGA GCAGAG7CGA TCCAGGCAGG T77TA7GAAG CCGG22AGCC 54 6 C 

AAAGGGA772 A7ACA7CGTC GGAGGACCC7 ACATGAAAT7 C27AAACGC2 C7G2A2AAAA 5 52C 

CA277777C2 77CCACAAAA AC7TCTGCCC TG7AG77GTG GCATAAGATT GGCCAGACCA 5 58 0 

-AAAAAA722 2A7AC7ACCA GGTGTCTCGG GGGAACACCT AACGGAG77A TGTAA7TA7G 564 C 

7AAAGGCAAG 7AG7CAGGC7 77CGAAGAGA T AAATG TTTT GGACC7TG7G CCAGACA22C 57C0 
7GA2A7CA7A 7GCGAAAA7A AAA27AAA2A G77CCA77C7 CCGGG777G2 GGACAGACAC 576 0 

A . 3T _-^ T3C aacTACTCTC 727TGCC777 CGCCAGTGAC TCAG77GG7T CCGG22GAGG SS20 
^ z: ^ czcc ^ CGTACTGGGG 22AG7GGGGT TGTCAT2TCC AGATGAATAC AGGGGAAAAG 53 8 C 

7C322GG2AG G7G7GTAAC2 A7TGTACAGT CAACACTGAA GCAAGCTG77 TCCAC2AACG 5 94C 

Q ^ z „ zrzz - G227A7CAT7 AC2G7GCCAC 7GGTGG7CAA C AAAT AT A 2 A GGGAGCAACG 
GGAA2ACAAA 3G7C777GAC TGTGCAAAC2 7GGGA7AC77 CTCGGGGAGA GGGGTGGACA 
GAAA7Z72AG G22AGAAAGC G7CCCC77TA AAAAGAATAA 7G72AGC7C7 ATGCTAAGAA 6220 
AACG22A2G7 GA77A7GACC CCCC7GGTAG ACAGGC7GGT AAAGAGAA7A G77GGCAT2A 618 0 

AC7Z7 3G3GA A77CGAGGCA GAAGCGG77A AGAGAAGTG7 GCAGAATG72 C7GGAAGACA 
GAGA7AACCC AAACC7GCCG AAGA2AG7TG TAT7AGAGTT GGTTAAGCCA CC7CGG7GGA 
GC7CCTG7GC AAG7C7CACA GAGGAGGACG TGA7T7AC7A C77GGGCCC7 7A7GC2G7AC 
77GGGGACGA GGTCC7GTCA 77AC7GAGCA CAGTGGGCCA GGGGGGGG7G CCATGGACGG 
72GAGGG7GT GGCC7CGG7C A7CCAGGACA 7AA7AGATGA 77GCGAG77A CAG777G7GG 
G222AGAAGA G2C77GCCT7 A7CCAAGGAC AG7CGG7AG7 GGAGGAG77T 777CCG7CCC 6 54 0 

CGGGCG7CG2 AAGCC7GACA GTGGGTAAAA AACGAAAAA7 CGCAT2CC7G C72777GACC 66 00 

7GGA77TGTA G77GTGTACG CG7AACGATG GCAAAGGAAC 7GG2GGCGG7 C7A7G27GAT 
G7372AGCGZ 7AGC2ATGGA G27C7GTC77 C7TAG7TACG CAGAC2CGGC AACAC7GGA2 
A77AAAAG7C 7GGCCC7CAC TACAGGGAAG 777 2 AG AG CC TTCACGG2AC ACTA27CCCC 
CTCG77AGAC GACAAAACGC ACACGAA7GC 7CAGGTC7G7 CAC7AGAA77 GGAGCA7777 
7GGAAAACG7 GGCTGATGC7 C7GGCCACG7 TGGGAGTG7G CAC7AGCAGA AAAC7G7CTC 
7AGAAGAGCA TTTTTCCCTC CTGCA77TGG ACACAACA7G CAACAAGCAA C2GGAGCG77 
GAGGG777AG GGGAGG7A77 GCG777CG77 GGGAAG77GA GG3G7A777Z



G3GCCGGAC2" 7A7A7G7772 AAA727GCGG 7GCC7AGAA7 

i=Trc=AAC = AGGGCAGCAG 7C7G2.AGGCC A7GG7CCGAG ACACGGCC7G CAG7CA2A7A 
7G7AGGCCCG CA7GGGG7GA GCGTG7GCGG GGCC7C777G AG AACG AG C7 AAAA2AGC7Z 
GGGCTTCAAA CGCC7GAG7C CA7ACG7AC7 ACCCCG7G7C AGTCGCGGG7 AAGGCAAGAT 
GA7GAAA7CA GA 2 AG AG GT C 7G7AA7GGGG G7AGGAGA7C ACCACA7777 CGGAGA337G 
icCAGATGTG 7GC7GGAAA7 C7ZAAACC7G A7C7A77GGA GC7C7GGCCA C72GGA7G2C 
ACC7GCGACG GAGACAGAGA CTGCTCTCAC C7GGCC7CGC 7G777AC7CA CGAGG77GAZ -5c- 
A7GGA7AAAA GGGGCG7CGA C77GGCCGGA TGCTTGGGCG AACGCGGGAC GCCGAAAGAZ 762 1 

GAAACC7777 7C7G7GG7GG 7C77777AGC 76 EC 
777777GA27 GG777

, - - — r , ~- - AAGGACGC7 C77C7GCG77 C7AGCAAG^ 

7CGG7GGAGG ACACCA-~^ ^ 

G-AAA77ACA C7AC7G7AC7 G CAAAAAC AG AACGAG7777 ACG72CGAC7 C AG C AAA 27 G 

C7GGCAGG7G G 7 C AG 27 AAA 777GGGCAAA TGTTCCAC^ A~«--u^ ~~ 

CG7AGGCAG2 7GG7AGG7GG GG32AAACGA GAGGAA37GZ 7GAGGGA7GC AAAA2AZ2GG 792C 

"AAGAAG7A7 A7Z7TCAGAA AG7GGCAGGC GACGG7777A AAAAAC7772 7GA77G7A.- /9oO 
AGA C AG C AG G GGCAGA7C77 G7G7CAGA2C C7GGG7C7AA GAC7G73GGG 

7ACAACGAGG 2A7C7GG727 A2AAAA2CAZ 77777ACA2A GAGCACAG77 CA7A7CZG7- 

- — r TG AAAA G7AA A7A7A7GAAA 8l£C 

CCG7GGCAGG AC77GACGG7 CGACTo^-- ^o-oo.^o 

AA77C777G7 ACTGCGAGCG 727GGGGGGG GAAGACG7AG AGA7CG73AC AC7GGAG7 
7ACAAAC77A T2ACGGGC7C GC7G7CAAAG CGAGA7A777 7A777C22AG 7 G GT C 2 AAA7. 
GTGACGG7GG G7CAG7GC77 CGAGGC7GCG GGCA7GC77C CCCATCAAAA GA7GA7GG7^ 
7CAGAGA7GA 7G73GCCGAG GA7AGAGGGG AAGGAC7GGA 7AGAGGGCAA C-7CAA2ZAG 
77G7A7AGG7 77GAGAA7CA AGACA7AAAC CA7C7GCAAA. AGAGAGC77G GGAA7A7A77 8460 

_ -t—C^- AA-AGAAC77 GGGAGAGGGA GC7AAAAA7A 

AG AG AG G7GG 7A77A7C«^ - .^-T- aa_~-~~~ 

C77C7CACGC CTCAGGGC72 A2GGGGG777 GAGGAACCGA AACCCGGAGG A27CA2AACG 
GGGC7G7A77 7AACA777GA GACA7G7GCG CCC77GG7G7 7GG7GGA7AA AAAA7A7GG3 
7GGA7AT77A AAGACC7G7A CGCGC77C7G 7ACCACGACC TGCAAC7GAG CAA22ACAA7 
GAC7CCCAGG TC7AGA77GG CCACZG7GGG GAC7G7CA7C C7G77GG777 GG7777G7GC 
AGGCG GGGCG CACTCGAGGG GTGACACC7T TCAGACGTCC AG77CCZCGA CA2Z222AGG 
A ™„ TCT AAGGGCCCCA CCAAACGTGG 7GAGGAAGCA 7C7GG7CG7A A3AG7G7GGA 
C777TACCAG 77GAGAG7G7 G7AG7GCA7C GA7CACCGGG GAGC777773 GG772AAC77 
GGAGCAGACG 7GCCCAGACA CGAAAGACAA G7ACGACCAA GAAGGAA777 7A77Z-G7G7A 









A7A373CC77 A7A7C77TAA GGTGCGGCGC TA7AGGAAAA 77GCCA3C77 
TGTCACGGTC 7ACAGGGGC7 7GA2AGAG7C CGCCA7CACC AACAAG7A7G AAC7CC33AG 
ACGCGCGGCA C7C7A7GAGA 7AAGCCACA7 GGACAGCACC TATCAGTGC7 T7AGT722A7 
GAAGGTAAAT G7 7AACGGGG 7AGAAAACAC ATTTAGTGAC AGAGACGATG 77AACACCA2 
AG7ATTCCT3 CAAC7AG7AG AGGGGC7TAC GGATAACATT CAAAGGTAC7 T7AG2CAGC7 
GG7CATC7AC GCGGAACCCG G 7TGG77TC Z C G G CAT AT A C AGAGTTAGGA CGAC7G7CAA 
77GCGAGATA GTGGACATGA TAGC2AGGTC TGCTGAACCA 7ACAATTAC7 77G7GACG7C 
ACTGGG7GAC A2GG7GGAAG 7C7G2CCT77 7TGCTA7AAC GAATCC7CA7 GCAG 3ACAA7 
CCCCAGCAAC AAAAA7GGCC TTAGC3T==A AGTAG77C7C AACCACACTG TGG72A237A 
CTCTGACAGA GGAA3CAG77 CCACTCCCCA AAACAGGATC 777GTGGAAA CGGGAGCG7A 
C AC G 27777 CG TGGG 3C7CCG AGAG CAAGAC CACGGCCG7G TGTCC3CTGG CAC7GTGGAA 
AACC77CCCG CG373CATCC AGACTACCCA CG AG G AC AG C 7TCCAC77TG TGGC2AACGA 
GATCACGGCC ACC77CACGG C7CC7C7AAC GCCAG7GGCC AAC7TTACCG ACACG7AC72 
TTG7C7GACC TCGGA7A7CA ACACCACGCT AAACGCCAGC. AAGG C C AAA C 7GGCGAG CA2 
TCACGTCCCT AACGGGACGG 7CCAG7AC77 CCACACAACA GGCGGAC7C7 A77TGG7C7G 
GCAGCCCATG 7C22CGA77A AC77GACTCA CGCTCAGGGC GACAGCGGGA ACCCCACG7C 

~. G"A m CCCCCATGAC CACCTCTGCC AGCCGCAGAA AGAGACG3T2 

AGCCAGTAC Z GCT3C7GCCG GCGGCGGGGG GTCCACGGAC AACC7GTC77 ACACGCAG77 
GC - vGTTT: . ::c T;V — 2AAAC 7GCGGGATGG CATTAATCAG G7GT7AGAAG AACTC7CCA3- 
GGCATGGTGT CG Z GAG CAGG 7CAGGGACAA CC7AATG7GG 7ACGAGC7CA G 7 AAAA7 C AA 
CCCCACCAGC G77A7GACAG CCATC7ACGG TCGACC7GTA 7CCGCCAAG7 TCGTAGGAGA 
CGCCATTTCC GTGACC3AG7 G7A77AACG7 GGACCAGAGC 7CCGTAAACA 7CCACAAGAG 
CCTCAGAACC AA7AG7AAGG ACGTG7G7TA CGCGCGCCCC C7GG7GACG7 7TAAG77777 
GAACAGTTCC AACC7ATTCA CC3GCCAGC7 GGGCGCGCGC AATGAGA7AA TACTGACCAA 
CAACCAGGTG GAAACC7G2A AAGACACCTG CGAACACTAC TTCATCACCC GCAACGAGAC 
7 CTG G7GTAT AAGG A C7 AC G CG7A3C7GCG CAC7A7AAAC ACCACTGACA TA7CCACCC7 
GAACACT777 ATC 3 CCCTGA A7C7ATCC77 7ATTCAAAAC A7AGACTTCA AGGC3A7CC-A 
GCTGTACAGC AG73CAGAGA AACGACTCGC GAGTAGCGTG 77TGACC7GG AGAC3A7GT7 
CAGGGAG7AC AAC7AC7ACA CACA7CG7C7 CGCGGGTTTG CGCGAGGATC TGGACAACAC 
CATAGA7ATG AACAAGGAGC GC77CGTAAG GGAC77GTCG GAGATAGTGG CGGACC7GGG 
TGG CA7CGG A AAAA 3 G GTGG 7GAACGTGGC C AG C AG C G TG G7CAC7CTAT GTGGC7CA77 
G3T7ACCGGA T7CATAAA77 7TATTAAACA CCCCCTAGGT GGCA7GCTGA TGA7CA77A7 
C G 77 A7 AG C A A7CA7CC7GA 7CA7T777AT GCTCAG7CGC CGCACCAATA CCATAGCCCA 

_,^ ( , - ^ Tr - — - — GCACC7GCTA GCGGCGGAGC 

GGCGCCGG7G AAGA7GA7C7 A- oA^oT MunTCo. rt uo ow~^. 
C22AAGACGG GAGGAAA7GA AAAA CAT C G7 GGTGGGAATG

~ GG GTGTTTGAGC GTAC2GGAAA 

GAGG2AGAAG GCGGATGA7C 7.,~~~— — 

CGGGGTTCGT CAGCGTC7GA GAGGA7ATAA ACGTGTGAGT CAA77GC7AG ACAT2AG722 
G G AAA" G GG G GAGTGAGAG7 GGA7TGGAGG TTATTG7T7G A7G7AAA7T7 
G"""G' — T7 GAAGGAGCAZ ATrtCnon^-.y - - rt * wvl 

2 A C AAATT A 2 CGTGCGCAGA 7GA7GGA777 7TTCAA72CA 777A7CGACC CAAG72GCGG 
AGGGG2GAGA AACACTG7GA GGCAACCCAC GCCGTCACAG TCGGCAACTG TCCCG77GGA 

T - r . - TG TTTCCAAACC GCGGGG2GAC CCGGG3TG3T 

GAGAAGAGTA T3CAGG---~ 7~C-^_^ 

7G7CG7GGAC ACCAGATTTC GA.CGCaCCT^ ■ ■ 

CGC333A3AG AGTGGGTGTA TCTGGAAAAC AAGGC3G3GA CAGGCAGG3A ATG~"TAT 
GTCGCA"" ATATTCCA" 7ATACGACA7 CGTGGAGACC ACGTACACGG C=3A"GGTC- 

.„ ». — -a TATCATTCCC AGCGGCACCG T"7GAAGC7 

GGAGGAC37G CCATTTA^C. 

. A7AC7AGA73 GGGCCAG7G7 CTGCG7SAAC GTTTTCAGGC AGCG77G3.A 

^1~~~ C :, -MCACCC: AGGGGGTAAA CC7GACCCAC GTCCTC3AGC AGGC"7"A 
GGCTG3C777 3GT -GC3CA7 CCTGCGGCTT C7CCA7CGAG GCGG73A3AA AAAAAATC77 
G33C3C37AC 3ACACACAAC AA7AT3CT37 GCAAAAAATA ACC7737GA7 CCAG7333A7 
GATGCGAAC3 777AGCGAC7 GCC7AAGAA3 CT3TGGGT3 C GAGGTG777G AG7GCAA7G7 
GGACGCCAT7 AGGC3C77CG TGC7GGACZA C3GGT7C7CG A7A7TC3G37 3G7ACGAG7G 
CAGCAA7"G GCCC==CG=A GCCAGGC3A3 AGACT"73G ACGGAA-GG A377;GAC7G 
CAGC7G33AG GAGC7AAAG7 77AT=C"GA GAGGAGGGAG 73GCC"7AT AC7CAA7C77 
A7C777TGA7 ATAGAATGTA TGGGCGAGAA GGGTT777CC AACGC3AC7G AAGACGA3GA 
A GAAA7C7CG7 3TGT777ACA CACAG7CGGC AACGATAAAC C3TACACCCG 
7A773 33CCTGGGGA CATGCGACCC =CTT==TGG= G7GGAGG7C7 77GAGT77CC 

n n CAGCATGG7C CGC3A77ACA A7GTGGA3T7 

TTG33AGTAC G A CAT 3 C7G ■ j CCG-_-; 

7A7AACG3GG TACAACATAG CAAAC777GA CCTTCCATAC A7CATAGCCG GGGCAAC7CA 

* AC^GGGTCCG TGTTTGAGGT 

GGTGTACGAC ttcaagctgc aggag.t.— ca~~ — 

CCACCAACCC AGAGGCGGT7 CGGATGGGGG OlAC . - ^> >■ 

^ATATrGGGG ATCGTCCCCA 7AGACA7G7A C7AG37773C AGG 3 AAAAG C 7GAG777G7C 

AGAC7ACAAG C7GGAGACAG 7GGG7AAGGA A7GCC7CGGT CGACAAAAAG A.GACATGTC 

__. . , , — TGA- GG7C3C3CAA AG37GG3AAA 

ATACAAGGAC ATACGGGCGG 777TT~~- - .G^~- 

„„ T3T3TT ATTGACTCGG TGC73G7TA7 GGA7C77C7G C7ACGG777C AGA3GCAT3T 
TGAGA-CTC3 GAAATAGCCA AGCTGGCCAA GATCGCGACC CGTAGGGTAC T3AC3GACGG 
CCAACAGATC AGGGTATT" C^GCGTCTT GGAGGGTGGT GCCAC33AA3 GTTACATTGT 
CCCCGTCCCA AAAGGAGACG C3GTTAGC33 GTAT3AGG3G GCCACTGTAA TAAGCCCCTC 
CATC2AAGCG CACAACTTGT GCTACTCCA2 ACTGATACCC GGCGATTCGC TCCACCTGCA
CCCACACCTC TCCCCGGACG ACTACGAAAC CTTTGTCCTC AGCGGAGGT2 CGGTCZACTT 
-S-AAAAAAA CACAAAAGGG AGTCCCTTCT TGCCAAG CTT CTGACGGTAT GGCTCGCGAA 12 22: 
GAGAAAAGAA ATAAGAAAGA CCCTGGCATC ATGCACGGAC CCCGCACTGA AAACTATTCT 12 2SC 
AGAGAAACAA CAACTGGCCA TCAAGGTTAC CTGCAACGCC GTTTACGGCT TCACGGG CGT 1344 0 
TGCCTCTGGC ATACTGCCTT GCCTAAACAT AGCGGAGACC GTGACACTAC AAGGGCGAAA 
GATGCTGGAG AGATCTCAGG CCTTTGTAGA GGCCATCTCG CCGGAACGCC TA3CGGGTCT 

„ ( ^CAATAGACG TCTCACCCGA CGCCCGATTC AAGGTCATAT ACGGCGACA2 12 6 2" 

TGACTCTCTT TTCATATGCT GCATGGGTTT CAACATGGAC AGCC-TGTCAG ACTTCGCGGA 126 = Z 
GGAG 2TAGCG TCAATCACCA CCAACACGCT GTTTCGTAGC CCCATCAAGC TGGAGGCTGA 12 74 0 

AAAGAT' AAGTGCCTT2 TGCTCCTGAC TAAAAAGAGA TACGTGGGGG TACTCAGTGA 13 8 CO 

CGACAAGGTT CTGATGAAGG GCGTAGACCT CATTAGGAAA ACAGCCT3TC GTTTTGTCCA 13 56C 
GGAAAAGAG Z AGTCAGGTCC TGGACCTCAT ACTGCGGGAG CCGAGCGTCA AGGCCGCGGC 
CAA „ rTT - iTT tcgggGCAGG CGACAGACTG GGTGTACAGG GAAGGGCTCC CAGAGGGGTT 
CGTCAAGATA ATTCAA.GTGC TCAACGCGAG CCACCGGGAA CTGTGCGAAC GCAGC3TACC 
A3 _, 3ACAAA CTGACGTTTA CCACCGAGCT AAGCCGCCCG CTGGCGGACT ACAAGACGCA 
; _„„ TCCC3 CACGTGACGG TGTAC2AAAA GCTACAAGCT AGACAGGAGG AGCTTCCACA 14 ISO 
GATA2ACGA2 AGAATCCCCT ACGTGTTCGT CGACG2CCCA GGTAGCCTGC GCTCCGAGCT 1422 0 
GGCAGAGCAC CCCGAGTACG TTAAGCAGCA CGGACTGCGC GTGGCGGTGG ACCTGTACTT 
CGACAAGCTG GTACACGCGG TAG C CAA GAT CATCCAATGC CTCTTCCAGA ACAACACGTC 
GGCAACC3TA GCTATGTTGT ATAACTTTTT AGAZATTCCC GTGACTTTTC CCACGCCCTA 14400 
GTGACTCAGA CGCGGAAACA GCC-CCTAGAA AGTTTCCTCT TGC3CTATGT GGGACAAGTA 144 6 0 
GAGTCCAACC TGGCAAGCAG TGGAG GAAGA CGCCAGACAG CCGATCTCGA AAAAAATAAT 
GCAGACAGAG GCAACGTTCA TCCTAGGTGA CTGG G AG A T A ACGGTGTCTA ACTGCCGGTT 
TACTTGCAGC AGCCTAACAT GTGGCCCCCT TTACAGATCT AGCGGCGACT ACACGCGGCT 
AAGAATCCCC TTCTCTCTGG ATCGACTAAT ACGTG AC CAT GCCATCTTTG GGCTAGTGCC 
- AAATATTGAG GATCTGTTAA CCCATGGGTC ATGCGTCGCC GTAGTGGCCG ACGCAAACGC 
CAZAGGCGGC AACGCGCGAC GCATCGTCGC GCCTGGCGTG ATAAACAATT TTTCAGAACC 
CATCGGCATT TGGGTACGCG GCCCTCCGCC GCAAACGCGC AAGGAAGCTA TTAAGTTCTG 14 660 
CA7A _. TTT cxcAGTCCCC TGCCCCCGCG GGAGATGAC2 ACATATGTGT TCAAGGGCGG 
CGATTTGCCT CCCGGAGCAG AGGAACCCGA AACACTACAC TCCGC2GAGG CACCCCTACC 
GTCGCGCGAG ACG CTGGTAA CTGGACAGCT GCGATCCACC TCGCCGCGAA CGTATACGGG 
ATACTTTCAC AGTCCTGTCC CGCTCTCTTT TTTGGACCTC CTGACATTCG AGTCCATTGG 
3 7=73acaa3 37g=aa3G7G ^r=== 3M =A attgaca^ ^"-^
gacgsgagaa agattttgca aa37aa=737 ttac c 

.,.3--C3T3" C3T7TC3777 ^KC=K G = =3 TCC 3 = = CG7GA37773 7CA7 3 G37ZA 
33377^3=7 373A7AA3AA C33G777333 A333AG337A 773G=A37r7 ATCrACACT:- 
-3A3AAAACT A7333A3373 AG3AAACCAC CACCC73AGG A77CAA773: 7377CGAG.^. 
G3A73G7GC3 AA3G33GGA3 A77GG3C777 T3TCATCATG GGG373GCCC G7GAAA3AAA 
G77737C73A TTTC-3GCAG 7A37C37733 G3G3AAGCAC GAACACC77A ^ GT ^=~ 

rr _ --— -T-TGA C3A7T3AAC3 GGACACAA7A G7GG33373G 3AATG3377C- 

~A7A7==AC C37GG7AAGG CA3CCAGCCA GGCACCA7AC AGC77C7A3G A3733AA33A 
IgAgL-TGG CACGT3GGG0 7C7TC3AGA7 CAAACG3GGA CCGGGA3GG3 737G7A3A33 
AC--3GCAC G7A3CGA7TA G3G3CGACCG C3ACGAGGAA CGGA7GCAA7 CGTGAC737C 
CGAGCACA7A 7GGCGCAG3A G7CAGA3CAG TGCTCCC37G C077TGCA37 G73CAG7AG7 
AAACGA3AG3 73333GGCGG C3A3CG3G7G 7GGGA77CGG ' TCA773AC33 GAGCGACATr 
G7CA7C7C7A A73GA37ACC C37C77AC7A AGAGAACAGC A3A7A7G737 CC3T.CGT.- 
r-.-AGCGTCG G3GAGA7CC7 C3ACAGAGCC 7ACCCGAAC7 77ACA777GA CAACAZGGAT 
CGCAAGCAGC AAACGGAGAC C7ACAC7GCA TTCTAC3CTT 77GGGGAG3A 
G77AGGAT37 73CCCACTG7 7G733AAAGC 7C37C3AGCG 73C73A7777 T-G^.—-- 

„___ - ,-3-CGTGGGA GG3CTCAAAA 7AA7AA7AC7 7GC73.CA7C 

CT3GTGCATG CGCAAGGAG7 G7AC37GC37 7GCGG7AAG3 ACC777C7AC AC3ACA37GC 
GCACCGGC7A 7737TGAGCG 7GAGG7GC7G AGCAG3GGG7 77GAG3CGCA G777ACCG7A 
ACTG3CArr= CAGTGACATC C7CGAAC77A AACCAA7337 AC777G73" AA3AAA3CCA 

, , AGACSAC73A GG AGTGTCG Z 

AAAA33G33C 7GGCAAAGCC 3777 GC... ='— ul " 

G7CAG3TC7A TC3G3377G3 GAAGACA3A3 C7GCGGA7A7 CGGTGA37GC GG37G3G3A3 
GAAAC3C3CG 737GGGGGC7 3G7GAGCACG AGC77GAGC3 7TACCCG3A3 CGCACCG37G 
GC „_ T _ A - C GTAAC CCGTA CAATCAGGAG ACATTTGCCT G7AATGCGAA GCACTA3A7C 
CCAGTCA737 ACAGCGGACC AAAAA77ACG C7G3GCCGGG GCG3CGGC3A 
CAGAACAACA G37A3ACG7C C7CCG737CA 73GAAAG7CA CAG3CA7G37 
7GG7GTAAC7 G73ACATAT7 777AGAG3AC TCGGAA7G37 3GCGAAACAA G3CA3CACC3 
C7GAAAC7G3 7GAAGAC3A3 7GA7CA7CCC G7CA7A77GG AG33GGAGAC A3ACA7TG3A 
AACGCCC7C7 7GA7GATCGC AGG3AAGGGC CGAGG777AC GCAGAG7GAC 73GG77AACC 
ACAAAAACGA 77GAAC773C 7GGC3GGG7A AAGA7AGACA GCAGGAAA77 ACAAACA77C 
AGAAAAAT37 A7G77GCCAC GGGACGCA37 7A3G7G7CCG GT7CCCACCC ACAGA777\j . 
C7TTAT7G 37 77 C AAAT AAA AG3G7G77C7 G7CAACG7CC 7CCGG3C7CA C7AG7A7737 
G777GCA7AC GGGC3TGTCG "CCAGGA7C AACAC77CG7 CCC37A7CGA CC37AA7ACA 
7AACACACAC AAA 3 AC AT AG TGAC7G7AGA CAG7TAA7C7 TTA77G7GTA GACACGCAA
GTTA7AAGAA A77TTATGTC AGGTCGCTGT T 




'AGT7A7CG TGGA7G7CAG 
7C 7GGGA7AGAG TCCAAAACAC GCACCGC77G ACC7GCAAAC 7T77CCAT7G 
CA~TCAGAAG A7AAAACGAA GCAAAG7G7C TCACCCAA7A C77AAG7CCC 7GAAGGCTC2 
— AA-AGACC GCGG7CAAA7 77GGG7GGAC TGTAGTGCG7 C7TAGTC AG C 7TA7TGAGC7 

i r— —AT G7CCCATCC7 AAGG7C7TCG 7CAGAAGCTC CATGACG7CC ACG777ATCA 

r—-~^ TZZ AAA C7CCG7C GTTAAAAAC7 TAAACAACAC CTCGAATTCA AAAAAGCCA7 

-GGCGAGC77 77TAAGGCAG CTAG7C7CAT 7AAATCCTAT TAACCCGCAG T3A7CA37A7 
CGTT3A7GGC 7GGTAG77TC AGATGAAAAA 7AGCAGCGGG C7CTAGAA7A CCC77GCAGA 
7GCCGG7ACG G7AACAGAGG 7CGCGGAAGC ATTCATCGAT CACCCA7AGC A7CCAA7TGA 
G*~ GTCTG AA7 GAGAAGA7CC 7TTTCAAACT CGGGGGCGTC CGGCAACTTG CZCCGCGTTZ 
CAGA7ACCAG CAGTGAACCG AC7AGCAAGA GAGACCACAA C7TGAACCAG CACA73GC7G 
C7AACGC33G A7ACACTAGC CGG7GGTGCC CGAGCGGGAG T7ACGAAG7C TCACTGAAGG 
GCGGG37CGC GGGTCGGGGC CGCTCCAAA7 CAGGCAACGC CGTA7CCGAA CTC 

, -t^-j- GG7C7CAAAC A7G7AAAAGA 7ACCACG77C TTGAAAAACC 

CGCCAGGCT7 GGGG77CACG CGGGCATACG CAGCCAAGC7 ATCATGCGAG 
CACACGCAAA G7CATG7AAA ACCCGGGT7A AAAA7AGCCT AAC7GGCCAG GGGCCAGTGA 
GCGCCTCCCG GTACAAGTCC CCACCGCCGA TGACCCAAAC C7TGTCAA77 TGC7G7GG7A 
3 ^, CTGGGCT xctcgCCAAC CCAAGCGCGG CATCGAGCGA ACTCGCCAAA AAGTGAGCAC 
CAGGGGGCGC- GG7777TAAC GTGCGACT7A GAACCACATT GA7TCTACCC GCCAA7GG7C 
GACAGCCGGC GGGAATCGAA AGCCA7G7GC GCCGCCCCA7 AACAACGATC- 7T77GT7T7C 
CAGGGGCACA G7CGGTAGTC AGC7G7CGAA AACGCCTCAT GTCTCCCCGC AA7GCAGGCC 
ACGGGAGACA 7GTG777TTT CCGATCCCGA GTTTGGTATC AACCGCAAC7 AC A GAG T AAA 
GTGTAGGA7C CATGCGGGGA GGGTATAGG7 AAACACCACC AAGCACACAG TG7( 
7ATAC7T77A ATGAAACATA AGGGCAGACG AAACAGCCGA ACGT7TCC7A 
7GGAACCATA GCCACCCGCA GGCAAACCC7 GTGGAAGGA7 ATCAAG7AGA GAGGAGGGTC 
CAGCCTTA77 A7GG GAGGAG ACAC7A7AAG CCCGATCGCC CGACTGGGCA CCAACATAAC 
CGCCACAG7A AG7GGCCCTA TACGGCTCAG CGCCCAAG7T G7TACAGTCA CACCCAACCG 
CGGT7GGC7G 7ACA77GTCA 7CACGTCCA7 CATTATG7G7 7GG7TCTCCC GCT7CC77GT 
ACCCTGCAGC 7TCA7GCACG GA77C7TC7G AGTCGCGATG CACAGGAGCG CCA7CCGCGG 
GGCCA7C77G GTCGGG7GGA GCTGCCCCCG CGGGGCCA77 7TGGTCGGC7 GGAGC7GGCC 
CCGCGGGCCG CTCG7GG7CC TGG77ATCCG CACGGGGAAG AATT7GC7GA AGCTCGA7C7 
CCTC7ACTGC ACACTCTGG7 GATGTCGGCC GAGGTC7ATA 7GGAAACAC7 TCAACCCGCG 
7G77TACAGC AGCG7ATGCC CGCCCCACG7 GGCGCATCA7 G7GGAAAAAC GCACCCAACC 
AAAAAACGTA G7AGGGGG37 GGAGGGACG-

TA7A7AGC2A AACCGAGG7 2 3GAGG3GZAA 

33"A="=r —CAATGTC ATAATGAAAA TAAAAACAAT ZAGTTCCAGA C"7"T3=T 

AAGTCAS-C3 A=G=AATAG= GTCAT~CGC GCAAGGSTCG GCAGACCAG- CGCGTGTTG. 

„ „„ ----- G^G~ 7T3TAGAGA7 AG7GAGGCAG G7GG77AAA7 

A7A2GAC3G- ------ — - 

^-^--7-3 GACGTTCTGC- AGCTGTCGTG 7GCATCGACA GGCTGTAAA7 C7G7GA..7G 
CGAGC7CC72 G77GGAAA72 cagcagacag gaacatcctg atcttcgata TG37GAGAGA 

-AAAA"ATGG CATTAACCCC 7GCAACAAG7 GAC3G7ACGA GGG2AGG3G7 

CCAGGCAAG2 GGGG73GZ33 TCG7TGG737 ATACAA77GC A7GAG7AGC7 ACT337AA7G 
CTACAGCCAC 73AC7GTAGA AGCCGG7TAA C7GGGAGGCG ACGG7GGGG7 GG7A7GGGC2 
AAC73AAAGA GACGAGTCCA 77CCAAACAC T7A7GTAC77 TGTGGCTCGG 3777A77G7A 
AGAGCGAAGA 3G3GCGTTT3 7GG37CAGCT TTATTGTAAC AGCGAAGA3G GACG7A7G73 

33A773A7G7 AG AC AAC C C G CTC3CA3GAA 7733 377777 
Z^ZGZGZZ?,Z 7AAA7ACGCA 3A7GA3CG33 CA7337G777 
C3GC3773AC 33GCACCAC7 37GAGGG3AC GC7AAA3A7C GCC77A33TG C7A7AG7G32 
■AC GAA7GGTAGG A7GCGGGCAG 7A3T3CAC3A G7C7AAAAT2 — ~ - 
TG77C1A 73 3AAGAAA3 AGACCGGAG7 ATGTGCAGG3 GC3GAAAG3G A3373GAG73 
CGCG7CA3C7 GCAGC3GTAG 7GGC7C7ATA TGCGT7TTG7 AGA7G7GGGC A73733CAA3 
G7G7GAA7AA A37C3CGGGG 777AAGACCA G7AA3ATGA3 GAAGGA7A7A AG7TAAGAG3 
GAA7AGC7GG CAATGTTAAA AGGAACTCC3 AAACGCA7G7 CTC3C3AC3T G7GA7AGAG2 
TGACAGGAAA GC73AC3G7C AGC7ACA7AA AA7TGACA7A AGAAGTGA2A GG3333AAGC 
GCCATCAA3G AGAAGT7CGC CGGG77CCAC GCACACA7AA 7GA7737737 A7C373CGGA 
77A7T7T77A T7AAATCCA3 AA7G7ACGAC AA77GGTCAA ACCGG73332 TG7A7AGTCA 
GCATCCGC3T C3ACGTACG3 CGCCCCAAAG TGC37CCAC7 GGAAACCGTA AACAGGTC3C 
AAATCCCCCT CCCTTCTGTG CGCCAGGCCG CG3CCGGCCA GGAACTCC37 GGAGC3A7T7 
77GTCCCA7A T7TTGAG7CG 7G7TCTTGAA AGC7C3C7GG AG7CAG7AC7 333C773AGA 
AACCAAAGCA GCTCT7GCAC 7ACGCC7CG3 CAAAACA3C3 G377TGTG37 7AG.AAGGGA 
AAGTGG72Z3 GCAGACTATA CCTGGCC7GC ATGCCAAA7A GAGAGAGGG7 G=C.A.G==G 
G7G3GG733A G73GATCGC7 GCCACGGCAC AAAA7T7C Z 3 7CAAC7G3C7 GAGA7A37GA 
AGTTCCTCGT GGGG337C73 AGCCCCAGTT ACC7CA7GC7 GAATCGAACA AGGG73AACC 
T=G3G33==A AAGCCAAGA3 GCCAGGCTTT 7GACAGAAG3 GAAACC3337 GGCA333AA7 
AA C777773 3 CGACA7ACAA GCT7AAAGGT ACAAACGGAA ACA7GA7AGA 73373GAAG7 
7737GAAG3C 37GTGCCCGG AGAGACACCC CTCAAC73GC AGTGCTCG3A GAC37ACAT3 
7A7A37CAGG G7C7TCTATA AAC3C7CCCC AAAAG TT7 AT AAAACACCGT 
GAAAACATCA CAAGXACAAA AAGTGTGTGT _ CTGACATTCA CA777AT7T7 TACAAGACAA Z-S€ Z
T7TTGTGCAG 7AGAGTTG7G CCT7CCGACA CCCCGCGCCG 7TCGC7G777 7CCT2-7AA77 21-12: 
GGGAGATCCC AC7CCT7GGC AGGCACG7T7 CACGAAACGC 7C77G7C7CG C73GCC7TA2- 214S: 
AC7T37GGAC CCAACA7GGG TA7CGTTAGA GATCCGTCGC GTAAA7GCGC AG CTGG 2 AAA 215-iC 
GCATTr -. TCA G CG AG CAGTG AC7GG7AATT GCTGCATCAG CTT CTT C A 2 2 CAGTC7TTCG 2I6C-C 
ATTTGTCGGC A2A2ACCTGG CGACCACGCT 7TGTCAAAAA TATCACACCC GGCTTGCTG2 2I6£C 
ACA3 — GGGA 3GTGGGGTAC CAGCTGGACA GAAGCACCTG T3GTAA7GGT C7T77C7GG7 21722 
AACCGAGACA GCAC7TGTCC GGTCTATGC2 AGGACGCTCC CAGCG7G7C2 CCAGA77GCA 21-80 
AACAAAGCAA GGCAG7CAGC AC AG CG AC G A GCAGGA7GCC CTTGG7G7C2 ATAAC7CCC2 2184C 
7CGTG7GTCC 7CGTGTAAAT GCGAAACGGC GATG77AGG7 CAGGCGCGG7 AAACAGC7CA 
ACTCGGTTCA AAACACGTAC G7GA7GTAG7 GCTGG77CTA CGACGCC7AC CTGTAAAC77 
CAGGA7CC72- GGC7777A77 ACGAAGGCCA ACACCCCAAA AAA7CCACGC CC2CG7GACC 
GCAGGGGCGG 77AC7AACGA CGGTTACAGG TCCC7CCCGA GCCACGCAC2 TGCCATGTAA 2208 C 
CC7GCAAGG7 AACCAGACAA A CAT CTAGG A AG CG 7 AAAT A TCCCCAGG7A G G AG AAGTA7 2214 0 
TGCA7A7G7C ACAGAC7CAA CACACACGGG CCG77ACGCA ACGGC7AGGG GCA7AACCCT 
TTACCGGC2-C GAAGCGCTAC GCG CTTCGCG AGAGGTA7C7 CCGTGTGCTT C7CCATCAGA 
AGACGCGTGC GCCG277CGC AGGCGACCCG CATAC777CC GCCCCGAGTG CG7TACAAAA 
A7GAC7GCC7 TCTGGCGACA ATACACGG7G GACG7CCAG7 ACCACCCGCA TAT C AG CTT A 223 6 0 
TCCGGTGGCA ATCTGGCACT GGACAGGGAA. TTCTCGCAAC AATCCGAGG2 CATGA7GGTG 2244 0 
GCAGGACCGC TGGCCGCACA TAGC7CAA7C ACGGCCACCC AGAAG AG CAG CCCCAAATGT 2 250G 
GCG CG C AA C A CCCAGCACAT GCTCCACA7A CAG77CTGGC GCCACAACGA TGATGCGCAA 2256 0 
AGGGGTGCAT TACCCTAAAT CCCAGCCTAG 77ATAAATTA 77GAAGCCCA GGCGACCAGG 2 2620 
GGTCGCCGCG C7TTTCCTCC CCAAACG CGA CGATAAAGAC CAGCG7TGC7 AAA7GTAACT 
TATGTATAAC CC AAAATATT GCGCATCGAT AAGG77TG CC AAAACACCCG AAAGTACACA 
CACAAAAAAA CAGCAACAAG ACGC7CACTA GACATTCACC CCT7CCCCCA CCCCCGAAAA 22 3 00 
CAAAACAACT TGACACAGGG GAAACACCAG GGGCGGCGGA GGTTGTCAAT AGTGTCCAGT 2 2 86 0 
ATTTC3TTAG ACGCGGGTTC 7TGGACCCGA TG7CCCAGGT CAT7AAAGTC TCAAATGGGA 
T7AAAGGATC ATAG7TCCCA GG7T7AATAC TCCAAGCTAT CC C AG AAC AG GACCC2GGCA 
GAACCCC3C7 TAACAGCACC AAATCCAC77 GCGGTCCCAG AAAAGGTCGC CGAGGTGGCA 
AGG7GACTGA AAAGGTCA7A GAGAGGACAC CGGTCCCA77 7CCCACGGTC CAAAAATCCA 
GCGC2-CCCCA CCGGC77TCC GAGAACTTCG GCAAAGCTAA TTTGCATGCG CTAATCCTTT 
TATGTG C ATA AATTA7GTAG ATGAGGAG7 2 GCGCATGCGC AGAAAAATTC AGAGCGCCCG 
GG7GCACGGG GTCACC7CCA GGTCACGCCG CTAGGTGGGA CCGTGAGCGA CTCGAAAAAT 2 32 SO 
CCCATCCGGC CGACCAATCC CGTTCGAGCT AGGCGACCGC 3
AGCC3TCAAT CAAATTCGGA GGCCTCCCAT TG3CCCCTAT CCCTAGAACT CCCAA3CTGA = 2 5 = 
TTGGCC=A3A GCGGGAACCA ATCAGCGATT AG AG TTTTG T TTTGATTTTT CC7ATATATA = 3 5 = : 
TATATATAAT CCTTTAATCC TAGCGCAGC7 GAGTCATCG C AG C C C CT ATT C CAGTAGGTA = 364: 
7ACCCAGCTG GGTAATCCAG TAG G TAT AC C CAGGTGGGTG AACCCAGCTG GGTATACCCA ' = 3 " C 
GCT3CAATTC TATAATTAAA CAAGGTAGAA ACCAACGGGG TCCTCAG3TG 3TATTTCC3G 22-: 
AAGC - kTTA;:c AAAT AAG G C A ACCTCAGCTG GGAATACCAG CGGACTACCC CCAACTGTA7 = 
TCAACCCTCC TTTGTTTTCC G G AAG TAT AT CCATTTATGG AAAT GAG CT G G3TCACTCTA 22BBZ 
CTGGGTTATT CTTTATAATA GGGCCCGATG AGTCATGGGG TTGGGATTTT 7CTACTAGG7 =39,: 
CGTTTCGGTG GATGGGTGCC AG G ATT AT AG GGGCCCTGTC CACGGGGTTG T.CGGTGGCG = 4022 
GGGGGGGGGC TAGTGAGTCA CGGGCCTGGA ATCTCGCCTC TGGGTGGTTT CGGTAGATGG 2406 2 
GGGCCGGGAG GATGGGGCCC CGCCCACCGC TGGCGCGCCC C AGAACATG G GTGGCTAAC2- 2412 1 
CCTACATGGG CAGCTTGTCC TACGGTTACG CCCATTTGAG ACGGGTTAAC CAACTG.TA2 241E2 
ACCCCTTCGC CGGGAACGCT ATAAAAACGA GGGACAGCAG CCCCCCCTCG CGCACCGCG2 =4=4 C 
GCGCGGCGGC AC3TGGGACG GATCTCTTGG ATTTACCCG7 AACGA3GAGC CCCGGCAGCA =430, 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA 2436C 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA =442 2 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA =44BC 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCA3CA 
CCCCAGGAGC CCCGGCGCGC Z^ZZZTZZZZ GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGCAACAACC TGTTGCCATG TATGGCGATT TGTATCAGTC ACAAGCACAC AACCCCTGC7 
AGTATTAATG GTGTTTAAAA CGTTCTACAC GTACGGCGGA CCGCATCCGT CGCAAGCAC3 =50=2 
CGCATATAAC CCCCAAATGC ACCATGATGA G AAG C AC AG C 
AAACATCGTT ATCCAATATC ATTAAAAACC ACACCGAAAT TTACACAGGT 
CGTGTTAGTG TCACCCACTG TACACAAGGC GTGTCGTATA TGTAGTATAG GTATTTGAT3 
AGGCGGAAGC ATATCCCGCT TCCAGCGAAC GGAAATAAGA ATCATCCGTT CCAGCATTTA 

kGG GCACAGAGGA TT CACATTGT TTAGAGAGAG TTTTTCTTAG TCACCATTCC = 52 = 2 
ATACTTGGGG AGTATTGGCG TACGATTTGG- GGGACGTTT3 AGGCTGGTCT ATTCTG33T Z
c . crTTT;:c:: cGGCTATTCT GT 3 C GAG GAT AGGGTCTTGA AATAAACAAT GTTTAG3GA3 
-AAAAGGTTC CAGTGAG33T CATTTGTCGT TGCACCCATC CC 3C3TTTGC TTAAT3A33 2 
GAAAAGTA3A GGAGAGGGAT G G AAAAC AT A TGG3ACGCGG GTTGTTTGAA AGT2AAGAG3 
-A—-TGT-TT TAATGAGGAC A3ATTTGGGC ACAGGCCAGA GGGTAAAGCC CTAGGTGTG 3 
GCGGGGGGGG GGGTGTATAG GCTGCGAAAA CCTGCACGGT GGATAACACG CAGGGCGTGA 
CGTCACATAT CTGTGTGGAC CCAAGTGGTT GTTCAACCGT TGTTTTTTGG ATG ATTTTT Z 
CGCACC3GCT - tt - T3TGGG CGCGGATAGG TCGGTACGCG CTGTCCCCZT AAGTCCCGCA 
CGGTC3TTCG GGG3CCC3TC CGGGTCGTCT CCGGATGAAC CGTCACGTTC TTTGTGT3GA 
GAGGGGACGT GT3CTTCAGA TGACTGGTCC GTGGGCTCCT CGTCCGTCGC GCCCGGGGGT 
CCGAGAAGGA CCGTCAATTC GATGTTATCT TCGTTCGCGG TTGGCCGGCG CGGCCGTCGG 
TATGGGAGTA CGGTGACCGG GGTGTTATTT GCCGCGTATA ATGCCCTCAC AGTGGCACTT 
ACGCGGCATA 73CC3C3AAA TGGAAA3ACA ATAAATATTT GGTAAAACCC AAAGAAGCAG 
AG AAAAC 3 3 A GGACGGGCGC GGGGGAGAAT GTTCCCGGAG GAGCAGTTAG GATGAGCAGG 
AGCGTCCAGG TGGAGAACGC CACGCCGACA AGCCCAGCCA CCACGAGAGA CATGAGCAGA 
AACAGTTCAA AAATTTCTTG GCGCTCGAT3 TGCGGCCAGA GGTTAAGGGG ACTACGCCAC 
TGCGTGCGCG TGCGGTATAT AACGCGACAC ATTTGAGAGG CCGTGTTTCG AGACAGTGTT 
AGCGAAGTGG TTAAAGAGTG CGGG7GGACG ACATCGAGGT CTCCGGTACA GGCGCAGGGG 

TCTTGTACGT CCTTAGAGGC CATGTGTGCA GGTGGGGTGG AAGTGTGAAA AAGGGAAAGG 
GGAGGTGAGC AGAGTGCGCA GTTAGTGTCG GACCCGCCGT CCGCCGTACT GTCGCTATCG 
CGCCTTGACA GAT3TCTAAG GTATTCACGG ACGCCACATG TGTGTCTATT TTCGTAGATG 
CAGGCTTTCC CTGGAAAAGT GTCACAACCC ACCCTGC7TT AGCTCTACAT CTGT ATTTTT 
GTTTACGCAC AGGATGAACG CTTCGTGCCC GTGCAGCCCC GCGGT3TCGG CCTG7G7TTG 
GAGGTTTTAT GAGTGGTTAG TTGTAGGCAG CTCCGGACAA GTTGTCCAAA ACACGGCGGG 
CCCCGCCCTT CC'TZCCZCC GGATGCGCGG ACACCGGAGG TATGAAATAA GGGACAGGCG 
T CAT 3 ACT AG TTATGAGAGA AAAAC 3A3AA CAGCTTTATT GGAAAACACC TGAGTGGATG 
CCCCACCCC3 CGCGTACGAC AGGCGTTTCT GTGGTGCGCT TCTGGGAAAA ACGTTTTTCC 
CCCATTTCTT CC7CGAGAGG TCTTCTAAGG TAGATAAATC CGCCCCC77T GCGC3TCTCC 
TAGAATGGCC TAGGCGCACG ATGGCGTTGT CGGCTCGAGC AGTTGGGCCG CAGTGATATG 
TTCAACTTTC GACCGTCTAA GCTATGGCAG GCAGCCGCTG CATCAGCTGC CTAACCCAGT 
TTTTGGAAGG G7C7GC3CAG ATCTGACGCC C7CGC7TGG7 CAG 3 AAAAT A ACT 3CGGGTT 
TTGGGCACGC TGGGGACGTG GGATACCACT CTTTTAGAAT TTGGACGGGC GGTGGGTGCT 
GGTGGAACC C GTAGCAGCAG CTATTAGGCG TGTACGACAC GAGTGACCCC GCGCTTTCTG 
TAAAGTGGTG TGCGGCAGCT GGGAGCGCTC TTTCAATGTT AATGTTTTAA TGTGTAT3TT ZSi*:
GTGTTGGAAC- TTCCAG3CTA ATATTT GATG TTTTGCTAG3 7TGAC7AACG ATGTTTTCTT Z? = = . 
3TAGGTGAAA GCGTTG7GTA ACAATGATAA CGGTGTTTTG GCTGG3TTTT TCCTTGTTCG 
CACCGGACAC CTCCAG7GAC CAGACGGCAA GGTTTTTATC CCAGTGTAT; 
ATGTTATArr TTTGACAATT T AACGTG C CT AG AG CT C AAA TTAAACTAAT AC 
AATGCAACTT ACAACATAAA TAAAGGTCAA TGTTTAAT C C ATATTTCCTG ACTTGTGTCT 
TGACTTGCGT CGATTGGGA7 GGGGGTGTGG GATGGGGGTG TGGGATGGGG GTGTGGGATG 
GGGGTGTGGG ATGGGGGTGT GGGATGGGGG TG7GGGATGG GGGTG7GGGA T3GGGGT3TG 
GGATG33GGT 3TGGGATGGG GGTGTGG3AT GGGGGTGTGG GATGGGGGTG TGGGATGGGG 
GTAAATGACA ATGGGGGTAA ATGACAATGG GGCGCTTGGT GACACAT7TG CCCCACCGTC 
GCC7GCCC3G AACCAGCTTG GTGATGTGCT GTCTGGCTCT caggtgcact TTATGCAAAG 
CAGTTGAGGC G C ATT AG ATA TATAAAACTT GGGTACACAC CCTTGGTGCT GTGCGCGTGC 
TAT3TGCCCT GGTGACC3TC CACAATGGAC GAGGACGTTT TGCCTGGAGA GGTGTTGGCC 
ATTGAAGGGA TATTCATGG C CTGTGGATTA AACGAACCTG AGTACCT3TA CCATCCTTTG 
CTCA3CCC7A TTAAGC7ATA CATCACAGGC TT AAT G Z GAG ACAAGGAGTC TTTATTCGAG 
GCCATGT733 CTAA7GTGAG ATTTCACAGC ACCACCGGTA TAAACCAGCT TGGGTTGAGC 
A73CT3CAGC- T7AGCGGCGA TGGAAACATG AACTGGGGGC GAGCCCTGGC TATACTGACC 
TTTGGCA3T7 TTGT3GCCCA GAAGTTATCC AACGAACCTC ACCTGCGAGA CTTTG CTTTG 
GCCGTTTTAC CTGTATATGC GTATGAAGCA ATCGGACCCC AG7GGT7TCS CGCTCGCGGA 
GGCTGGCGAG GCCTGAAGGC GTA7T3TACA CAGGTGCTTA CCAGAAGAAG GGGACGGAGA 
ATGACAGCGC TATTGGGAAG CATTGCATTA TTGGCCACTA TATTGGCAGC GGTCGCGATG 
AGCAGGAGAT AACGCGTAAT TCGAGGTCCC CGGAAGAGTA GAGGGTTGCA TGTTATACAA 
ACAACATAAA C ATT AAATG A ACATTGTTCA AAACGTATGT TTATTTTTTT TCAAACAGGG 
GAGTAGGGTA GGAAGGGTAC GTCTAATACG TAACTGTTCG CTACTGCTTG TTCAGGAGCT 
CCTCGCAGAA CATCTTGCGA ATTTTAGATT TTGGACTAGA GCGACTGCTG GCTTCAACGC 
GGTTCGATGT AGGGTTCGGC GTAGGAGCGT CTTTCTCCAC CGCCGCGCA7 GGTGTATGCG 
TGGTCTCCGG TGCCTGTTGT TGGATGCTCT GCGT3CTGGA GGCGGGGGTG GGTTCAGCGG 
GTG3T3CGCC AACTACCGCG AGTCCTGTAG AGACTGGCGG GTGGCTCACA TGTGGC7GAG 
CAAAAAGGAT GGGCGCCGCT 7GCTGGAACT GACCGTGTGG CGCCTGCACG TAAATGGG7G 
GG7GTACGTA GGTTCCTCCG TGCTCCTTCA 77G7CGGGAA TT3ACACGGG ACCGCTGAAT 
TGGCGT3GGG CCTGTAGTGT GGATCTACTG CGGCTGCTGC TGCAGAGGAG GACGGCGGTG 
GCCCTGCGTG CCAACCGTTC AGTT7CA7CT CTTTGAGTTC AGACTGTATT TCCGCTATG7 
TCTTTGACAT GGACAAGATA TCCTTGTGAT ACGCCGGCTC C7CTCCTGGA AAGAGGTGTC 
„ CGTCGTC CTCT3CGC CG CGCTTGCGCT TCCCCGTCCT ATATCCAGGC AGCTGTGGCG 
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3CCC3GG377 TA33A33C7C A73TGGCGG7 77773C 



A~-"AAGA G3733G777T A77GA777GG 77G737GCAG GA737ACAAT 77C3G737AA 
,_ 3T , TACCT GTTAAGGM G G37ACTGCGA A7G3C3GGA7 C7AC3AC3A3 G7GG733TG3 
GACGCAAGG7 7377GCGGAG G7G7GGAAG3 73GTG7AGGA 7GG3C7C3AG GAGA73G3CG 
-GTCAAG7GA GA7GC73C7G 73TGAGGGA7 ACG3GGAGAG C77373GAT3 3AC..GAAC3 
ATAAGG73G3 G3TC77GAGG GGC37GGCGA A77A737G77 7CAC3GGC7A GGG37CACCC 
A-GA3G77C3 CATCGCCCCG GAAAACCTGG 7GGACG3AAA 377777G777 AA7C7GGGAA 
C73CAGGC7C- 37C377GCGG CGGGG7ACTG 3CTGGCC777 7GGGGCAGCG 



GTGT 



ATGAACACGA ACG 3TGGG7G C3C77C7T3G C CCAGAAG 37 777CA7TTGC TAC37GATAG 
TGT7A7GCGA GAGAGG737C 7GC7AG777C- GGCCAGCGAA 
G3AGGGAG7C 7373GCGACA 7CCG37GGA7 37ACGGCA7A 
C337C7CGGG 77A7C77333 377C3G7CC3 AA3C3CAGC7 GGCCTAC^ 

"AACAACGC G3777AAA33 AC33CGAG3A CCACCG3GA3 3CA3CGAAGA ACCA7AAAG7 

ACS'""" T A7 3 GTAG7CA7C3 C=3CCG==AA AC73G3AC77 GATAA7C7CC 7G3AGAAGGG 

73GGTGGGGA 7G3G7373AA AGCAGGACGT CGAGGCCC7G 773737733C AGGG33AGG3 

,^, TZG: C7GGAGCAGC G3CA37GGA7 CTCG3AA7GT AAGC7GC7GG 773AGGA. . . 
CGAATA7C7C A77AAACC7A 37GCC737CA GA777ACAAA 7G37CCGGG7 7G777G7GGG 



2 1=-.' 

: . ~ i : 



AAG-G-A-- — 37A733733C GG7333777-- 

^ ;r -_ T33A73G7A7 33377 

A337AGG7GA 7337CG37G3 3GAACA7C73 GC33C3A3A3 "3A77A3G: 

77T37AG3A7 337CAGGAAG GCGGA777GG GGATG3A7A7 GA7A73G7G7 7GACGA3AG: 
r3 G7"3GG7 GG737GAC33 37TGGA73C3 GA3333GAA7 73G3.G3"7 
~IaA-ACGC CGGG37CAA7 ATG=TGGKA CACCT77G7C AGT777=AA7 AGG73GAGG3 
1„„, 3 ~- GAAG77GGCA 737A7A3C77 77GCCA77AA G3737GCAGG GGAG73A3GA 
A.A777337G7 3GAAAG3TCC 733A30C7G3 A3GTAC77AC G7G37GGAGG A7G7G33CG3 
33TC3GA377 A3A7AC7GA7 3AGAA7CTGG AAACCAG73A 3733G3G7C3 7373=3. ACA 
3G3C3AC737 G==G=37CGG C33CGCAGG3 CGCA7A373A 7AC37377GA AACACGG3AC 
CGC7GGGA37 37C-3C-A7AAC 7G3CGG33A7 GTA7AGA3GA 7AAAGACA33 C73G3GAGC3 

._ Tr , TAGGGAGA77 TT73AGGG3G G773733CCA 

A337373GAC- 7A7C7C=«Aw „^TGu 

737333A3G7 37333CCAGC 773GA7G3CA G37C7AGGAA G3C73GCGAC 
CG3T3CAGAA AATACC37GG 

37GG7AG373 7733A 

.- - -. rj. G37TGGS3CA GGACACAACA 737A3AAAC3 

GATCCGGA7C C-AS«i~»»u-- « 

3AGGCC37GT G33A7G37C:3 3AAAA7ACG7 3737GAGACC GAG333GTGA 
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.CTTGAA A7AGACCCAG TGTCCAGC2C ACTT2TG72T 
TT ATTGGAAGGG GTTCTGTGAC TGGGAGATAA TCCGTCACCT 2 2 340 
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TAAGAATTTA AAT ACATT Z Z 3 2 520 
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ACACC 

GGGCGTTTG3 ATTAGCCTGC AGTGTGGGGA TTATGTAGTG CTCCGATATG AACC 
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_ r ^-^.-AGCATG GGGGC3TTTG GATCCGGTAG GGCACCGGG2 TGAAATT.3G 

GTGGGAG GAG GGATACCAGG TTCAAGCGGC GGTTTGGGTG CCCTZGCGZG ACTTGCCCAA 

ACT G "AG C AA TCCATACGCG AG GAT AAA CA. CCTCCAGCGC AACAATCCC2 GCTCGCAGG7 

TCCACTGGTA TGCGGAAAAT GGTGGTATAT CGGACCCAAA CATGGCGCTG GTAATGGCGA 

ATAcrAAGTC CATGGCGGGG GCTGTCCCTG GCGCGCCCGT ACCCTTGTTG TGGGGAAATA 

_ AGC: — - AGC: - AT=ATT GCGTGAAGGT TGTGGCGCTG GAAGAAGGCT GTCGGATAGC 33 96 2 

3G _ CTcr -. T ATTGAGAGGC GCCAGCGAGG CGCGCTCCTG GGGGTTTGAG TATGTGAAGC 34 02.0 

TGAAGTCCC2 AGGACC3CTT TCCTGTTTTA G CTGAGTG AT TAGCAGGTCT AG CTTTTGAG 34 CSC 

GCAGGTCTGC TAACAGGTCA TGGGGAGTAG CGGGCAGTTG CCTGGATGTC TTTTGACAAA 3414 C 

AGTACGCGTT GACGAGGCAA AGCGCGGCCT GGGTGTCCGT GAGATGCCTG GCGTCGGCGA 3 4200 

AAAAGTCAGC GGTGGTCGAG GCGAGGGTCG TCAGGGTGTG AGAGATGAGT TTGAGCGATG 34 26 0 

^ AAAG— AAGA GTCCCCTTTA GTTCTTTAGG GAAGACG CGC CGCTGCATGG 34320 

CGTTGTCC3T GAGGCT3ATG AACCACGGCC CAAAGGATGG CAACCACTGA TTCTGGTTCA 343 6 0 

TGTACAGGGT GGGCATGAGC TCGCCGCGCA GGTCCCTGTC AACGGAGAAG TGAGGGTCCC 3 4440 

GGGGGACGAT CGCCACGGTG AAGTTACGGT GGCTGGCCTG CGGGGGGGAT GTCACTAAGG 

GAGGCTCAT3 G3AACG3CTT TGGGGCATGT CTATGTTGTC AGACCATGTC ATGTTGCCTA 

TCATCTGTTT GACCGCGTCG ATATCTGCGT TAATGACGCG GACG CGTGAG TCATGGACCT 

GAACAAGCCG GTCCAGCTCT AGGGAAAGGA GGTGTGCCTT TGTCTTTCGT TCTCGATTTC 345 6 0 

GCACGAGTTG GCTGCGCAGT CCAAGGGCGA CCGTTCTTGT TTCTTCCATG GTGGGCTTGT 3 4 74 C 

GAATAAACAG CACGTTTTCC GGGTGTGGGG CCCAGAATCT TCCCGCCTCT GTCCATCTTC 34 8 CO 

GGTTTTTTGG GTACGTTAGA TAGGACCTTT CTGATGTCAG CATTTTCTCT AGCAGTGAGA 

AAGGCGCACA ATTTTCCTTC GGTGGTGTGG ACCGGCGTGG GAAACGCCCC GGGTGATTCA 

GAGTATACTG TGTTTAGTGT TTTCTGATTC TTAAATATCA GCAGGGG CGT GATAGTCCAC 34 980 

GCCTCGC-TAC CCGGAGGGGC CGAGTGAGCG ATGTAATGGA TCGAGTCGGA GAGTTGGCAC 

AGG CCTTGAG CTCGCTGTGA CGTTCTCACG GTGTTGGTTG GGATCAGCTG GTG ACT GAGA 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35100 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
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GTAACATACG GGCTGATGCG 3AGG3GATA3 3AGAAT.AGG 



GAAGT3--3A GCT- 

CAGTGGGGAA TTCT3TGCCC TAGAGTCACG TGAAAGAATA ATCTGTGG73 7CCA 

, , -./-ii"^: ■~ATAGA7CGG GCAGGGTGGA GTACTTGAGG 15 

GGGTTCTGGG GC-jj-«-- 

MCCKSr AG3TGGCCAG GTGGGCCCGG TTACSTSCTC TTTTGCGTGC - G "= G ~ 3; 

CTG CTCAGGG ATTTCTTAAC =TCGGC=T=G GTTGGACGTA C=AT3G=AGA AGGCGG.." 



atgaag: 



AAGTTTTCGC TCCCCTTCCG GACGAACGCC ACCGCTATCC TGCGAATGAT GCAGCCTGG 
AACGTTG3GG GTGGGTCTGG GAGGGGCACT CAGTGGTGCG TC7TTGATAG GCATCTCCTC 
TC==CAG=M TGGTGTTCCC TCTCATGCAC CTGAAGCACG GCGGCCTATC TTTTGATCAC 
-T^C^T TACTTTCCAT CTTTAG AG C C AGAGAAGGCG ACGTGGTCG " CATTCTCACC 
CTCTCGAGCG CCGAGTCGTT GGGGCGGGTC AGGGCGAGGG GAAGAAAGAA CGACGGGACG 
3TGGAGCAAA A-TACATCAG AGAATTGGCG TGGGCTTATC ACGGCGTGTA CTGTTCATGG 
AT C ATG TTG Z AGTACATCAC TGTGGAGCAG ATGGTACAAC TATGCGTACA AACCACAAAT 
ATTCCG3AAA TCTGCTTCCG CAGCGTGCGC CTGGCACACA AGGAGGAAAC TTTGAAAAAC 
CTTGACGAGC AGAGCATGCT ACCTATGATC ACCGGTGTAC TGGATCGCGT GAGACATCAT 
CCCG7CGT3A TCGAGCTTTG C7TTTGTTT C TTCACAGAGC TGAGAAAATT ACAATT 
GTAGCCGACG CGGATAAGTT CCACGACGAC GTATGCGGCC TGTGGACCGA AAT~ACAGG 
_._._ r _ T ==AaT c=GGC TATTAAACGC AGGGCCATCA ACTGGCCAGC ATTA3AGAGC 



3cr 
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GGAGCGGA7T CGGTGGGGCG CGGCGGAGAA AAGGCCTCTG TGAC7AGGGG AGGGAGGTGG 
GACTTGGGGA GC7CGGACGA CGAATCAAGC ACCTCCAGAA CGAGCACGGA TATGGACGAG 
CTCCCTGAGG AGAGG AAAC Z A77AACGGGA AAGTCTGTAA AAACCTCGTA CATA-AGGAG 
GTGCCCACCG TC3CGACTAG CAAGCCGTGG CATTTAATGC ACGACAACTC GCTGTACGGA 54 : 

ACGC7TAGGT TTCCGCCCAG ACC7C7CA7A CGGCACCCTT CCGAAAAAGG CAGCA77777 
GCCAGTCGG7 7G7CAGCGAC 7GACGACGAC 7CGGGAGA77 AGGCGCCAA7 GGATGG3TTG 
3r . c - Tc:; - iC .. G77GGAGGG7 GTGTGGTCGG CCTCCCCTTG CGCCTCCAAA 

CCGGCAAGTA GGCCGGCAGA CGGG73AATG GGGGACG7GG GCTGGGCGGA CCTGGAGGGA 7 S = 

_ rrJ _ — AAAGGG A77777AAAA ACATCTACGA AGGGGGGCAG TCTCAAAGCC 640 

CGTGGAGG3G ATGTAGGTGA CC3TCTCAGG GACGGCGGGT TTG C CTTTAG T C TTAGGGG Z 900 
GTG AAA7 TTG CGATAGGGCA AAACATTAAA 7CA7GG7TGG GGATCGGAGA AT3ATCGGCG 

~— -GGTCACCAC GCAGCTTATG GTACCGGTGC AC 77 C ATT AG AACGCCTGTG 

ACCGTGGAC7 ACAGGAATGT TTATTTGCTT TACTT AGAGG GGGTAA7GGG TGTGGGCAAA 
TCAAC3C7GG TCAACGCCGT G7GCGGGATC TTGCCCCAGG AGAGAG - GAC AAGTTTTCCC 
GAGCCCA73G TGTACTGGAC GAGGGCA777 ACAGATTGTT ACAAGGAAAT 

„ GTAAG3C GGG AGACCCGCTG ACGTCTGCCA AAATA7A7TC A7G Z C AAAA C -26 0 
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CAG7C7AAAG CAG77AA7CA CC7AGAGGAG ACATGCAGGG 7C7AGCCT7C 77GG7GGC7Z Z : 4 : 

77GCA7GC7G GCGATGCATA 7CG77GACA7 GTGGAGCCAZ TGGCGCG77G CCGACAACGG Z 1 C ■ 

CGACGACAA7 AACCCGC7CC GCCACGCAGC T CAT C AATGG GAGAACCAAZ CT77CCA7AG ZIc" 

-J—33AATT CAACGGCAC7 AGTTTTTTTC 7 AAAT7G G C A AAATCTGTTG AATGTGATZA ZZZ' 

CC-GAGCCGGC CCTGACAGAG T7G7GGACC7 CCGCCGAAGT CGCCGAGGAC C7 CAGGG7AA ZZSC 

, GAAAAA GAGGCAAAG7 C7T77777CC CCAACAAGAC AG77G7GA7C 7C7GGAGACG Z 3 -i : 

GCCATCGC7A 7ACGTGCGAG GTGCCGACGT CGTCGCAAAZ 77A7AACA7C ACCAAGGGC7 24 CC 

~7AAC7A7AG ZGZ7C7GC7C GGGCACC7TG GCGGA77TGG GATCAACGCG CG7C7GGTAC 24 6 C- 

7GGG7GA7A7 C77CGCA7CA AAATGGTCGC 7 ATT C G CG AG GGACACCCCA GAG7A7CGGG Z5ZC 

TGT777ACCC AATGA7TGTC ATGGCCGTCA AGTTT7CCA7 A7CCA77GGC AACAACGAG7 

CCGGCGTAGC GCTCTATGGA G7GG7GTCGG AAGATTT CG 7 GG7CG7CACG C7CCACAACA 

GG7CCAAAGA GGZ7AACGAG ACGGCG7CCC A7C7TCTGTT CGG7C7CCCG GA77CAC7GC 

CA7Z7CTGAA GGGCCATGCC ACC7ATGATG AACTCACGTT CGCCCGAAAC GCAAAA7A7G Z~eZ 

CG CTAGTGGC GA7CC7GCC7 AAAGAT7CTT ACCAGACACT CC7TACAGAG AAT7ACAC7C 2SZC 

GCA7A777C7 GAACATGACG GAGTCGACGC CCC7CGAG77 CACGCGGACG ATCCAGACTA 

GGATCGTATZ AA7CGAGGCC AGGCGCGCCT GCGCAGCTCA AGAGGCGGCG C C G G A CAT A 7 

TCTTGGTGTT G777CAGA7G 7TGG7GGCAC AC777C77G7 7GCGCGGGGC AT7ACCGAGC 

ACCGA777G7 GGAGGTGGAC 7GCG7G7GTC GGCAGTA7GC GGAAC7G7A7 T77C7CCGCC 

3CA7C7CGCG TC7GTGCA7G CCCACG77CA CCACTGTCGG G7A7AACCAC ACCACCC7TG 312C 

GCGC7GTGGC CGCCACACAA A7AG C7CGCG TGTCCGCCAC GAAG77GGCC AG777GCCCC 
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GC7C7TCCCA GGAAACAGTG CTGGCCATGG 7CCAGC7TGG CGCCCGTGA7 GGCGCCGTCC 

CZTz::T CCAZ 7CTGGAGGGZ A77GC7ATGG TCGTCGAACA TA7GTA7ACC GCC7ACAC77 

ATGTGTACAC AC7CGGCGAT AC7GAAAGAA AA77AATG77 GGACA7ACAC ACGGTCCTCA 

CCGACAGC7G CCCGCCCAAA GAC7CCGGAG TATCAGAAAA GCTACTGAGA ACATAT7TGA 

7G77CACA7C AA7GTGT AC C AACA7AGAGC 7GGGCGAAAT GA7CG CCCGC 7TT7CCAAAC 348C 

CGGACAGCC7 7AACATCTAT AGGGCA77CT CCCCC7GC77 7 C7 AG G ACT A AGG7ACGA77 3 54 0 

7GCA7CCAGC CAAGT7GCGC 3CCGAGGCGC CGCAG7CG7C CG CTCTG ACG CGGACTGCCG 36 00 

77GCCAGAGG AACA7CGGGA 77CGCAGAAT TGCTCCACGC GCTGCACCTC GA7AGCTTAA 

A777AA77CC GGCGATTAAC TGT7CAAAGA TTACAGCCGA CAAGATAA7A GCTACGGTAC 

CC77GCC7CA CGTCACGTA7 AT CATC AG 77 CCGAAG CACT C7CGAACGC7 G77G7C7ACG 

AGGTGTCGGA GATCT7CC7C AAGAGTGCCA TGT7T AT AT C TG CT AT C AAA CCCGATTGCT 3 84 1 

CCGGCTTTAA C7777C7CAG A7TGA7AGGC AC ATT C C CAT AG 7 CT AC AAC ATCAGCACAC 

CAAGAAGAGG TTGCCCCC77 7GTGAC7C7G 7AA7CATGAG C7ACGA7GAG AGCGATGGCC 



^CACA 77GA.T7A7T7 G73GG7GAGG GAGAAGGGjA 

VTAAGGGGG A7GTATAGAA GAGGCGGAGC GAG7GG777G 77TG7AA77r 

7gtg77ttat tgggttgtgg ggggttatct agtttg77Ta gagagtgtt7 tgg~ 

ATTA3AGGG7 GAA7AAA3GG TAGATTTTTA AAAGGTTT Z C TGTGGA77G7 TT77G7A7G2- 

G _„ TAC77 GGC AAGAAA7 GGGAGGACCT GAGAAAGTGG A77GGGG7GA GATATGAGT7 

ZZZZ TGGAGGTAGC CATGCGGCGC T7TGACGGTG T77GGGGG7A GAGATGATAA 

ZCT TGGAACAATC 7GGGGGTTGG CGAATGGGT7 



AG7AG7' 



7 gag: 



CATGGGTTG7 A" 

GAAA77G7G7 A7GG7ATTCA GGGAGAAGAG GGCGTGG7G7 AGGGGAGG77 
7AGGAGAGCG 2G3AAGAAG7 CCCGCTCGTG TGTTTTGGGA GGGGGAAGT7 
G7GGGGGG7A GAG2GATGAG AAAGAGGAGA CGA7G77TTC GAGCGGGATG C7GCGCAGGA 
ACACG7G777 GAGGAAGAGG 7GTTGTAGCC GGTTCAGTT7 TAGGTTGGGT AGAAAAG77A 
TGGAGTTGT7 AGCACGCTGC A7GA7GGTAA CGGTG77GAA GTGAGAGAGG GGGGT777T7 
GGAGTCT2GG GGG7G7GAG7 C 7 AA7 2 A7G7 AGAACA7AGA C3GGGGG7G3 7TG77TGTG7 
— -G7GA-A3 GA7A7GCCG7 7G37AAAGG7 G7GCGA7GT7 G7G777CAG7 A7AGA7GTGG 
77TGAGGG3G AGGGGGTGTT ATGGGG7GAG G2GGTAAAGG CGAC7CTGGG TGAAAGAGG7 
TTA7GGGG77 GGCGGG77CG 7GGA7GAGGA GACGG77GTT CGGGGCG7G7 A7GGGGACGC 

^ AGA A7AATC7 TAAAG77GG7 ATAAGAGTGG TGGGT2GTTA 

TGGGGAGGGG GGAC77CGGT AG7ATGTGGG TGTGGTGGAA TTGGTGGGGG GGTAGGAG7G 
3C77GGAGTG C AGG 7 AAA GG Z 2AAGAG ATG C3GTC7C7TC GGGTAGGGA2 AAG7GGG777 
7TAAGGGGTA GGGGTGGGGT G AG AG C ATG A TGGG7AGGAA CGA7AG7TC7 GGG7G777AG 
GGGGGTAGAG 7GGCAGGG7A GAGGAGTGGG GAG T G 2 G AAA GTTTTGGAAG AACAG7GGCA 
TGGG3AG777 AGGA77AGAG A77GGGAGGA 7GGCCGCCAC CGCGGGAGAG GTGAAGAGG7 
GAAAC AGG ZG C7GGGG7G7C GAGAGGG3GG GGG2GGGCTG 7 A GT AG AGTA G2G777AGG7 
CCGGAAG7CG 7 AAC AT AG C7 7AGAGGAGCG GACGGACGCA AGG7ACGTGG GGATCGGC7C- 
GCGGTGTCTG CTCG77GGAC GGGGGGGTTG GGTGGCGCGA GTGGAGGCG7 AG777GGGAA 
7GGCGTGACG GAGAA77TGT GGG7TTAGAG CGGGGAACGG A7GAGCCG7G GTGGCGACGA 
ACGAAATGAA GTT7GCAT7G CGG7CCAAC7 CGTCTAGCCT GG7G77CTTG '77TGGGGGA7 
AGAT7T7GGG GA77AGG77A G A GTTTTTAT A7CCGAG7AC TGCGCAC7GG 7GT-7GCTTT 
TAG7G7GAC7 G ATT AT GTTG 777GAGAAG7 C AAAG AGG Z Z CGGGGGGGCG GGTGGGGTAA 
7GCAAGCCAC GT G AAG Z GTG AG AAA G G AA C AG G ATT C GAG CAGAGAGTCG AGGAAG 777 . 
7GTG7AGGG7 G7G7A77TGG GAACGGT77G TGTGGTGAAG TAGGGAGAA.7 A77G7A7777 
TGT77GGGTG GATG2GGGGG TGGTGG7GGG TGAGAATGGG CGGGAGGTGG 7G 3 G G AA7 Z 7 
G77CGACAAG AGGGTGGGGG 7AGAGTTTAG AAATGGTGGC TGTGGGGGGG TT AAA G G AGG 
ACAGGT77AG GGGA7CG7TG GTGGAGAGCA CAGATGGAAA GTTTGTGG7G GAAAA7AGG7 
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77TTT2GG32 3A7T3T3ACC AT3TACTGGT T77C3AGTC2 GTGZ^GZZZ AA3GTGGAG7 £:.:: 
7rr\AATTTG3 7A733ATACA GGAAATATGT G33TGAT7GG CAGAAAGGA7 77CA3337A3 6 2 S : 

C 3A7TGG GAA G AG AAAGTG C AGCATGTCG3 CA3TGATGTT GATGT7TAT7 3333733377 
GA3A3ATGT7 3T3G3AAAAA AACACG377A TGGTAAAAGA AG3TTC37T7 A3GGA27A37 
7T33TATAA3 AAAA773TT3 GTCAATCT3G GGATGTTTAA AA7A3T37T7 TGGAGGGTG7 
7AGGAA33T3 G GAG 377 AT C 77AG7G77AA T3A3CA7G77 G3TG77GAAT A7G373A737 6420 
TGAAGT77TC 3AAACTGA33 7G7T7TGTGG G7T 3 CAG CAT G7CTGACA37 37AGA337G3 64 ST 

33AGAGT333 333GT3CGTG G33G3GTAT3 377GGAAG3A C3 3 3TG 3 AAA. T7T3377T3A 6 54 0 

TGG3TG3TC3 333G7377T3 GZZGZZT^ZZ GGA7T3TTGA AAG337C33C G33AGGAGA3 
GCGGTGTCTC 37GGG7G33T AAAAAGTTTG 3GCAGGGGTG 3AGT33G3TG 3A3GAGTGG3 
3GATG3A3TC 7333AC7G33 ATACA3ATGA CGAGTCTGTA GATGGC3GGT GTG333GGA7 
A 3 ACT AG AT A 37AGG7A3AA T3TGGGGTA3 TGACGACCAC CCTGTATGG3 7TTG37333G 
G3T23TTG3G 773GA777TT AG3TGCAGAC GGGA3ACGAG CTG3TTTAGA 333AG3TGAA 
AG3C3AC3AG A733337333 77AA33TTGA 3GT33TGG7G CTTACTCTGT 7733A3AG37 
737T3A3CAC G37G33CA37 3GCT3TAC3T T3TGAG33A7 GGGACGGCGC AG 3 GAG A3 3 A 
G3T3T3C3T3 23A33333AC G73G3CATGA AG 37G 3TGA7 G7TAAACT7T AAAAAATG 7 A 
SCTGTGCGTC TGC-GGATGCG GGTGGCATTA 773AAAACGA GAGATG CTTG AGG37373CA 
33AG73CAAA A7AAT77TGA TAGATTGTGG GTTGTAGACT ATGGGGCAA3 A3 33 Z 3 AG AA 
AG 3 CAT 3 AAA A3ACTGTTCG AACTC 3 3 A3 A ACTCCAGG7A CCTGCACAGT AT3CTGAA3A 
TG3CTTT3TA A3A7ATGG7G CACGT7AG7A G3G3GGGAAG ATA 3 AG C GAG 33TA3CT3C2 
TGAATTC3CA G33777ATCA CAATCATCGG 7AAG7T C 3 3 A T3AT23CACC GCA3G7AGG7 
AG77G7CGG7 G73TA73TGT CCGCGCGTAA ACA3TCCACC ACCG73AATT A77AAA3377 
C333GC73TA C2GTCGACCG A3TTTTCCCA AAAGAGT333 TT3TTGA7GT A7AAAAGGG7 
G3A3GCGTTC 3333AGGAG7 AG7CTGCGTA 73GCTCTGCA GGCGAAAAAG GTGGGCTCGG 
GCTGCAT3AT C7TAT3AAGA CCTT3TAAGG 7CAGCTCTGC CTGCAGGTGC GAG77GGTGG 
CCAGACAGCA GAATATTTC3 AGCTGTGAT7 CC3AAGTCGC 77GA7AACAC GTGGTGTGCG 
GACTCGTCG7 3AGGGAGGCG 3T3GGTGG3A GTAGTAGGGG G3CCTC3AGC GCTG2GA733 
AGGCGA337T GGAG 3AACGA CCTTTCCC3T A3373333A3 GGAG33CAAC CTCCTAAC3C 
AGATTAAGGA G733G37G33 GA3GGA3T37 TCAAGAGCTT TCAGGTATTG CTGGGGAAGG 
ACG33AGAGA AGGCA37GTC CG7TTCGAAG CGCTACTGGG CGTA7A7A3C AATG7G3TG3 
AGTT7377AA G77737GGAG ACCGCCCTCG CCGC3G37TG CGTCAATAC3 GAG7T3AA3G 
A337G3GGAG AATG A7 A3 AT GGAAAAATAC AG777AAAAT TTCAATGCCC ACTA77G333 
ACGGAGACGG GAGGAGG 3 33 AA3AAG CAG A GACAGTA7A7 CG7CA7GAAG GGTT33AATA 
AGCA3CACA7 3337G3GGAG ATT GAG CTTG CGGCCGCAGA 3ATC3AG 37T CTC77333C3 
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A3AAA3AGAC 3CCC77GGAC 77 C A C AG AG T ACGCGGG7GC CA7CA 

C777G CA377 7G37A7G3AC GCCC7AGAAC GGGGG77AG7 3GACACGG77 C7CGCAG77A 

AACT7C3GCA CGC7CCACCC GTC7T7A777 7AAAGACGC7 GG3C3A7CCC G7C7AC7C7G 

AGAGGG3CC7 CAAAAAGGCC GTCAAG7C7G ACA7GG7A7C CA7G77CAAG 3CACACC7CA 

7AGAACA77C A777777C7A GA7AAGGCCG AG CT CAT G A C AAGGGGGAAG CAGTATGTC 2 

-AACA-r" CTCCGACA7G CTGGCCGCGG TGTGCGAGGA TACC37C777 AAGGG7G7CA 

GCACG7ACAC CACGGCCTC7 GGGCAGCAGG TGGCCGGCGT CCTGGAGACG ACGGACA3C3 

^ A - C , AG — G 3 CTGA7GAAC C7GC7GGGGC AAGTGGAAAG 7GCCA7G7CC GGGCCCGCG3 

CC7ACGCCAG C7ACGTTGTC AGGGG7GCCA ACC7CG7CAC CGCC377AGC 7ACGGAAGG3 

CGATGAGAAA C777GAACAG 777A7GGCAC GCATAG7GGA CCATCCCAAC GCTC73CCG7 

C7G7GGAAGG TG AC AAGG C C GC7C7GGCGG ACGGACACGA CGAGATTCAG AGAACCCGCA 

7CC-CCGCC7C 7C7CG7CAAG A7AGGGGATA AG777GTGGC CATTGAAA37 77GCA3CGCA 

« AA — - GA"7"AG777 CCC7GCCCAC 7GAAC CG3CG CA7CCAGTAC ACC7A77TC7 

— CCTG773G CC77CACC77 CCCG7GCCCC GC7ACTCGAC ATCC37C7CA G7CAGGGGCG 

7AGAATCCCC GGCCA7CCAG 7CGACCGAGA CGTGGGTGG7 7AA7AAAAAC AAC373CC72 

77TGC77CGG 77ACCAAAAC GCCC7CAAAA GCATA7GCCA CCC7CGAA7G CACAACCCCA 

CCCAGTCAGC CCA3GCAC7A AACCAAGCTT 77CCC3ATCC CGACGGGGGA CA7GGG7AC3 

GT^AGG7A T3A3CAGACG CCAAACA7GA AC C7A77C AG AACG77CCAC CAG7A77ACA 

7GGGGAAAAA C37GGCA777 G77CCCGAT3 7GGCCCAAAA A3CGC7CG7A ACCAC3GAGG 

ATCTACTG CA CCCAACC7C7 CACCG7C7CC 7 C AG A7TGG A GG7CCACCCC 77C777GA77 

7T77TGTG C A CCCC7G7CCT GGAGCGAGAG GA7CGTACCG CGCCACCCAC AGAACAA7G3 

TTG 3 AAA7 AT ACCACAACCG C7CGC7CCAA GGGAG777CA GGAAAGTAGA GGGGCGCAG7 

7CGACGC7G7 3ACGAA7A7G ACACACG7CA 7AGACCAGC7 AAC7A7TGAC GTCATACAGG 

^^-mr-r -AT' -73" 7C7GC7ATG7 AA7CGAAGCA A7GA77CACG 

AGACGGCA77 7GACCCCG\_G aATw-^-ao. xco^.-r* 

GACAGGAAGA AAAA7TCGTG A7GAACATGC CCC7CA77GC CCTGG7CA7T CAAACC7AC7 
GGG7CAACTC GGGAAAAC7G GCG77TGTGA ACAG77A7CA CATGGTTAGA 77CA7CTG7A 
CGCA7A7G3G GAA7GGAAGC A7CCCTAAGG AG3C3CACGG CCACTACCGG AAAA7C77AG 
GCGAGC7CA7 C3CCCTTGAG CAGGCGC77C 7CAA3C7C3C GGGACACGAG AC3G7GGG7C 
GGACGCCGA7 CACACA7C7G G777CGGC7C 7CC7CGACCC GCA7C7GC7G CC7CCC7G 
CC7ACCACGA 7GTC7TTACG GA7C77A7GC AGAAGTCA7C CAGACAACCC A7AA7CAAGA 
TCGGGGA7CA AAAC7ACGAC AA C C CT C AAA ATAGGGCGAC A7TCA7CAAC C7CAGGGG7C 
GC.TGGAGGA CC7AG7CAAT AACC77G77A ACA77TACCA GACAAGGG7C AA7GAGGACC 
ATGACGAGAG ACACGTCC7G GACGTGGCGC CCC7GGACGA GAATGAC^C AACCCGGTCC 
" TC 3AGAAG C7 A7TC7AC7A7 GTT7TAA7GC CGG7G7GCAG TAACGGCCAC ATG7GCGG7A 
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T3GGGGTGGA CTATCAAAAC GTG3C3CT3A C3CTGACTTA CA^CGGGCCC GTCTTTGCGG 1C2I: 

ACGTGGTGAA CGCACAGGAT GAT ATT CT A C TGCACCTGGA GAACGGAACC TTGAAGGAGA 1Z2€ Z 

TTGTGCAGGr A G 3 C 3 AC ATA C3CCC3ACGG TGGACATGA7 CAGGG7G2TG TGCAC3TCG7 1 C 3 2 C 

TTCTGACGTG ZZZTTZZ5~Z AC7CAGGCCG CTCGCGTGAT CAGAAAGGGG GACCCGGGCC 1C3SI 

AGA3TTTT3 3 GAC3CAC3AA TACGGGAAGG ATGTGGCGCA GACCGTGGTT GTTAATGGG7 104-;; 

TTGGTGGGT7 cgcggtggcg gaccggtctc GCGAGGCGGC GGAGACTATG TTTTATCCGG 105 0 0 

- A r-CTTTAA GAAGCTCTAC GCTGACCCGT TGGTGGCTGC CACACTGGAT CCGCTCCTGG 10560 

-AAACTA7G7 CACGAGGCTC CCCAACCAGA GAAACGCGGT GGTCTTTAAC GTGCCATCCA 105 2 0 

ATGTCATGGC AGAATATGAG 3AATGGCACA AGTCGCCCGT CGCGGCGTAT GCCGCGTCTT 10663 

GTCA3GGCAC CCTGGGCGCC ATTAGCGCCA TGGTG AG CAT GCACCAAAAA CTATCTGCC2 

CCAGTTTCAT TT3CCAGGCA AAACACC3CA TGCACCCTGG TTTTGCCATG ACAGTCGTCA 

GGACGGACGA G3TTCTAGCA GAGCACATCG TATACTGCTC CAGGG2GTCG ACATCCATG7 

TTGTGGGGT7 GCCTTCGGTG GTACG3CGCG AGGTACGTTC GGACGCGGTG A CTTTTG AAA 

TTACCCA7GA GATCG2TTCC CT3CACACCG GACTTGGCTA 3TCATCAGTC ATCGCCCCGG 1098 0 

GCCACGTGGG CGG2ATAACT ACAGACATGG GAGTACATTG TCAGGAC3TC TTTATGATTT 

TCCCAGGGGA GG3GTATCAG GACCGCCAGC TG CAT 3 ACT A TATCAAAATG AAAGCGGGGG 
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3CTGCGAGAA CCTG3CCGGT TTGA3TCATG GTCAGCTGGC AAGCTGCGAG 
CGCCGGTGA7 ATCTGACGTT GCCTATTTCC AGACCCCCAG CAACCCCGGG GGGCGT3CGG 
C3T3CGTGG7 G7CG7GTGAT 3CTTACAGTA ACGAAAGCGC AGAGCGTTTG CTCTAGGACC 11340 
ATT CAA.TA C Z AGAC7CCGCG TACGAATGC3 GGTCCACCAA CAACCC3TGG GCTTCGCAGC 11400 
GTGGCTCCCT C33CGACGTG C TAT ACAAT A TCACCTTTCG CCAGACTGCG CTGCCGGGCA 1146 0 
TGTACAGTCC TTGTCGGCAG TTGTTCCACA AGGAAGACAT TATGCGGTAC AATAGGGG3T 11520 
TGTACACTTT GGTTAATGAG T ATTCTG Z C A GGCTTGCTGG GGCCCCCGCC ACCAGCACTA 1158C 
CAGACCTCCA GTACGTCGTG GTCAACGGTA CAGACGTGTT TTTGGACCAG CCTTGCCATA 1164 0 
T3CTGCAGGA GGCCTATCCC ACGCTCGCCG CCAGCCACAG AGTTATGCTT GACGAGTACA 11700 
TGTCAAACAA GCAGACACAC GCCCGAGTAC ACATGGGCCA GTATCTCATT GAAGAGGTGG 1176 0 
CGCCGATGAA GAGACTATTA AAG CTCGGAA ACAAGGTGGT GTATTAGCTA ACCCTTCTAG 11S2 0 
CGTTGGCTAG TCATGGCACT CGACAAGAGT ATAGTGGTTA ACTTCACCTC CAGACTCTTC 118 8 0 
GCTGATGAAC TGGCCGCCCT TCAGTCAAAA AT AG GG AG CG TACTGCCGCT CGGAGATTGC 1194 0 
CAGCG7TTAC AAAATATACA GGGATTGGGC CTGGGGTGCG TATGCTCACG TGAGACATCT 
CC3GACTACA TCCAAATTAT GCAGTATCTA T Z C AAG TG C A CACTCGCTGT CCTGGAGGAG 
GTTCGCCCGG ACAGCCTGCG CCTAACGCGG ATGGATCCGT CTGACAACCT T 3 AG ATAAAA 
AACGTATATG CCCCCTTTTT TCAGTGGGAC AGCAACACCC AG CTAG C AG T GCTACCCCCA 
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2 AG Z Z GAAA33A.-C C; 



rGZA acggattt; 



GTTTGTGTTr 



^TTGTG TTC: 

C2CATGGTCG TGCCGCAGCA ACTG3GGCAC GCTATTCTG2 AG 3A32T3T7 GGTGTATCA2 
_„ CTr2A ^TKS: GGGGGCGCCG G AT GAT 3 7 AA ATATGG2GGA A2TTGAT2TA 
TATAC2ACCA ATGTGTCATT TATGGGGGGC AC AT AT 2 GTC TGGAGGTAGA 2AACACG3A7 
CGACGTACTG CCTTG2GAGT G7TTGA23AT TTGTGGATGT AC2TTTG7AT C2TATCA32. 
TTGGTTCC2A GGGGGTGTTT CGGTCTGCTC ACGGCGCT3G TGCGGCACGA CAGGCAT2C7 
CTGACAGAGG TGTTTGAGGG GGTGGTGCCA GATGAGGTGA CC AG GAT AG A TCTCGACCA2- 

G ^ 3rzTZC GAGATGA.CAT CAT C AG GAT G CGCGTGATGT TCTCTTATTT T2.AGAGTGT1 

AGTT CT AT AT TTAATCTTGG CGGGAGATTG CACGTGTATG CCTACTCGGC AGAGA2TTTG 
GCGGCCTCCT GTTGGTATTC C2CACGCTAA GGATTTGAAG CGGGGGGGGG GTATGGCGT2 
ATCTGATATT CTGTGGGTTG CAAGGACGGA TGACGGCTCC GTGTGTGAAG TTTCCGTGCG 
TGGAGGTAGG AAAAAAA2TA C3GTCTACCT GCCGGACACT GAACGTTGGG TG3TAGAGA1 
CGACGCCAT- AAAGACG22T TZTTGAGCGA CGGGATCGTG GAT AT G G CT C GAAAGCTTGA 

„, , T"AA ATTCTGATAA CGGCTTGAGG ATGGTGCTTT TTTGTTATTG 

„ AC __ GC -^ ^^----tjjt ACCTAGCCCT GTTTCTGT3C CC"7TAAT3 "TACTTSST 
^ CT _ ;CTCA ag: -~ga3T ttgccgagcc CGTTGTGGGA C3TGAG3TGC TCTT"CA3A 
CCCGGTTGAG ATGTCTCGCG 3T7GCGATGA CGCGATTTTC TGTAAACTG Z CTTATACCGT 

iA-A-CACGT TTGGACG3AT TTACCCGAAC TCTA3AC3C3 AGCCG3AC33 

CAG33CTACG GATTACTGTA T3G7C-TTAG AAGGGCTTTT GCAGT7AT3G TTAACAC373 
A7GTGCAGGA GTGA3ATTGT GCCG3GGAGA AACTCAGACC GCATC3CGTA AGGACACTGA 
GTG33AAAAT CTGGTGGCTA TGTTTTOTGT GATTA7CTAT GTTTTAGATC ACAACTGTCA 
CCC3GAAGCA CT3T3TATC3 CGAGCGGCAT 7TTTGACGAG 3GTGATTATG GATTATTTAT 
CTTTCAGCCT CGGA3CGTG3 CTTC3CCTAC CCCTTGC3A3 GT3TC3T3GG AAGATATTTA 
CAAC33GACT TAGCTAGCTC GGCTCTGGAAA CTGTGACCCC TG3C"AATC TATC3ACCCC 
_ =CC _„ T _ ^^---r^ AATAAAGGTG TGTCACTGGT TATACCACGA TTAAAAACCA 
CTCACTGAGA TGTCTTTTTA ATCGCTAAGG GATTATACCG GGATTTAAAA CC3CCCACT3 
„___ T __ AC G ; TAAGAG TT GGG7GCTTGG GGGGTTTTGC ATTGCTCTOT TGTAAACTAT 
ATATAAGTTA AACCAAAATT CG3A33GAGA CAAGGTGACG GTGGTGAGAA CTCAGTT3AG 
AGTCAGAGAA TACAGT3CTA ATCAGGGTAG ATGAGCATGA CTTCTTGGTC TCGAGT3ACC 
GGAGGAATGG TGGAC3G3TC C3T3GTGGTG CGAATGGCCA CCAAGT7TC3 C373ATT3GT 
37TA7AACAG TGCTCTTCCT CCTAG7CA7A GGCGCCTGCG 7C7A3737TG CA77CGC373 
TT737GGCGG C7C3AC7GTG G333GCCACC CCAC7AGGCA GG3CCACC37 GGCG7A, =AG 
_ TCCTTCG .„ CCCTG33 ACC GGAGGCCGGC- TGACATGCAC CGCCGACGGT GGG GAT AG CT 
- -C-ACCG-A3 AATA7ACATG CCAGATTAGA ACGGGG7GT3 TG77A7AA73 
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GA7GGC7A73 3GGGGGCTGT AGA7AA77GA GCGCTG7GC7 777 A7TG7GG GGATA7G3G C 
T737ACA737 G7CTATCA7C GGTAGCCATA AAATGGGCCA T3ACAAC7G3 CACAAG7AAG 
7CG7CCGACA TGTG C7T77G CTTGGCGCTG TATGAC7GCC CTCCA7CCC7 AAGCGGGAC3 144 CZ 
CAC77GA7CG C3C33ACCTG 77C7ACCAGG TAGGTCACCG G3TCAAAT3A TATT773A7C- 144 5 Z 
GTGTTGGACA CCACCGTCTG GCTGGCGCTC AGGGTGCCGG AGTTCAGAGC G7AGA73AA7 14 52 0 
G7CTCAAACG CGGAGGA777 CTCGCCTCCC AACATGTAAA TTGGCCACTG CAGGG 2GCTG 14550 
C7C7TG7CAG TATAGTGTAG AAAATG7ATG GGGAGCGGGC A7 A7TT C G 77 AAGGACGG7T 14 6 4 0 
5CAATGGCGA CCC CAGAATC 7TGGCTGCTG 77GCC77CGA CCGCCGCG77 CACGCGC7CA 14 "O: 
A77G7GGGG7 GGAGCACAGC GATCGCCT7A ATCA7CG7GC A7GCGCAGGA CGC7A7C7CC- 
7AAGCAGCTG CGCCAGTGAG GTCGCGCAGG AAGAAATGC7 CCATGCCCAA 7 A7G AG G C 77 
CTGGTGGGAG 7CTGAG7ACT CGTGACAACG GCGCCCACGC CAG7ACCGGA CGCCTCCG7G 
77G773GTA7 ACGCGGGG7C GATG7AAACA AACAG C7G77 77CCAAGGCA C77C7GAACC 14940 
7GC7GGG333 7GG73TCTAC CCGACACA7G 7CAAACTGTG 73AGCGC7G3 GTCACCCACC 15000 
AC33GG7AAA GCG7AGCATT TGACGACGCT GCTCCCTCGC CCATTAGT7C GG7G7CGAAT 15 06C 
GCCCCC7CCA TAAAGAGG77 GG7GG7GG77 77GA7GGA77 CGTCGATGGT GATG7ACG7C 15120
GGAA7G7GCA GTC7G7AACA AGGACAGGAC AC7AG7G CG7 CTTGCAGGTG GAAATC7TCG 15180
CGGTGGTCCG CACACACG7A A37GACCACA 77 C AG CAT C7 TT7CCTGGGC G77C37GAGG 15240
TTAAGCAGGA AACTCGTGGA GCGGT3TGAC GAGTTCACGG ATGATATAAA TATAAGCT7G 15300
3C3TCTTTC7 GAAG CATG AA ACCCAGAA7A GCCGGCAG7G CATC C7777T AATAAAATTC 15360
GCC7CGTC7A C G 7 AG AG C AG GTTAAAGG7C TG7CCCCGAA TGC7C7GCAG ACAC3GAAAG 15420
AC A C AAAA G A GGGGC7CA7A AGCGGC7AAC AGTAAAGGAG AGGAGGCGAA CAGTGCGTGG 15480
c , cr „ TT::T TG33 AA7AAA AGGGGGCGTG TG7GCCGA7C 37A7GGGTGA GCCA37GGA7 15540
CCTGGACA73 7GG73AATGA GAAAGAT777 GAGGAGTGTG AACAATTTTT CAG7CAACCZ 15600
CTTAGGGAGC AAG73G7CGC GGGGG7CAGG GCACTCGACG GC3TCGGTC7 CGC7GACTCT 15660
CTATGTCACA AAACAGAAAG AC7C7GCC7G CTGA7GGACC 7GGTGGGCAC GGAG7GCT77 15720
GCGAGGG7G7 GCCGCC7AGA CACCGG7GCG AAATGAAGAG TGTGGCGAGT CCC77ATG7C 15780
AGTTCCACGG CG7G77T7GC C7G7ACCAG7 GTCGCCAGTG CC7GGCATAC CACG7G7G7G 15840
ATGGGGGCGC CGAATGCGTT CTCC73CA7A C3CCGGAGAG CG7CA7C7GC GAAC7AACGG 15900
G7AACTGCAT GC7C3GCAAC A77CAAGAGG GCCAG77777 AGGGCCGG7A CCG7A7CGGA 15960
C777GGATAA CCAGG7TGAC AGGGACGCA7 ATCACGGGAT GCTAGCGTGT CTGAAACGGG 16020
A C A7TG 7 3 C 3 G7A77TGCAG ACATGGCCGG ACACCACCG7 AA7CG7GCAG GAAATAGCCC 16080
7GGGGGACGG CG7CACCGAC ACCATCTCGG CCATTATAGA 7GAAACA77C GG7GAG7G7C 16140
TTCCCGTACT GGGGGAGGCC CAAGGCGGG7 ACGCCATGGT CTG TAG CATG TATCTGCACG 16200
77ATCGTC7C CA7C7A77CG A C AAAAACGG 7GTAC AACAG TATGCTAT77 AAA7GCACAA 16260




CGTGTTGACC TTCTGGCTTG GCG2TTAGCC TGCTTCT 



™GC ATTGrCAAG:: GGG7GCGGAC AAAATGGATG CGCATGGTAT i£: = : 
;TC CTGGTTGCCA CCG7TTGGGC CACGTGGTGr TGC2TAC-GAr -£3S: 
TACCGCTGGA GGCCGAGATG ATCTTTTTGA CCTArACCGG 
—^SCCGG TCGGCAGGGT CATCGGGC-G GTTGGTGGTG TGTGGGAAAC GTGTGrTGGr 
AGGGGAGGAA AACCAACT7, 

^7~TAA7C7~ A3T=TTA=T3 7CA3AT7TC7 CTATCTA7C7 "GG7GG7SG CTAT33S33C 
GGGACG3AA7 AA737GCG3A 37CC3ACCG7 7GACGGG37A T=GC=GC=M AGG3C3"37 

„, GGAGGAAC 7GCAGAGGCT GGCGCG7G37 ACGCCGGACC CGGCACTCAC 

CCGTGSACC3 7T3GAG3TGC 7GACCGGCC7 TCTCCGCGCA GGGTCAGACS GAGA3CG3Gr 
CAC7CACCAC ATGGCG3TCG AGGCTGCGGG AACCGTGCGT GGAGAAAGCC TAGA~"C-" 
7GTT7CACA3 AAGGGGCCAG CGGG3ACACG CGACAGGCCA CCCCCCG733 3AC7GAG777 
CAACCC737C AA73"GA7G 7A"C3CTAC C7GGCGAGAC G3CAG7AACG TGTAC7CGGC- -~0A: 
TGC77C77AC 7ATGTGTGTG 777ACGAACG CGGTGGCCG7 3A3GAASAGG AC73G37GCC 
GATACCA373 A3777CCCAG AAGAGCCCGT GCC=C=3=CA CCGGG777AG 73T7CA7GGA "716C 
CGAC77377C A77AACACGA AGCAG7GCGA C777G7GGAC ACGCTA3A3G CCG7C7G7CG 
-CAAGGC 7A3AC37TGA GACAGCGCG7 GCC7G7CGCC A7TCC7CGCG ACG7GGAAA7 
;CA 377AAATCG3 AC77777AGA GGC3TGCCTA G7G77AC3GC- GGC7G3C77C 
5GAGGC7AG7 G3773GA7AA GAG77GCCAC G7CCC7GC" 377GGCCG" ACGCC7377G 
GATGGAC37C- 77AGGA77A7 3GGAAAGCC3 CCCKACACT CTA3G777G3 AG77AC3CGG 
CG7AAA7737 GGCGG~ACGG ACGG7GACTG GTTAGAGA77 T7AAAACAGC CG3A737GCA 175 = 0 
AAAGACAG73 A33GGGAG7C 773TGGCA7G CG7GA7CG7C ACACCC3CA7 7GGAA3CC7G 17580 
GCT7GTG77A C77G3GG377 77G37A7TAA AGGCCGC7A7 AGGGCG7CGA AG3AG3AT" 
GGTGT73AT7 GGA3GCCGCT A7G3C7AGG3 GGAGGCGCAA AC7TC3GAA7 
AGGAA7GCAT ATGGACTGTT AACCCAATGT CAGGGGACCA 7A7CAAGG7C 7 
GCAC „„ AT - TC3 CCGGTG 7ATGACCC7G AG3TGG7AAC CAGC7ACG3A C7GAGGG7GC 17620 
C7GC77ACAA 7GTG7C7G73 3C7A7GT7GC 7GCA7AAAG7 GA7GGGACC3 7G7G73GCTG 178 EC 
TG3GAA77AA G33A3AAATG ATCA7GTAC3 TCGTAAGCCA G7G7G77737 G7GCG3CCGG 
TCCCG3333G G3A7GGTAT3 3G3CTCATC7 AC777GGAGA G777C7GGAG GAAGCA7CCG 
GACTGA3A77 73CCTACA7T GC7C7GC7GC CG7CGCGCGA ACACGTACGT 3AC77GACCA 
GACAA3AA77 AG77CA7A" 7CGCAGGTGG 7GCGCCGCGG CGACG7GACC AA7737AC7A 
TGGG7C7C3A A77GAGGAA7 G7GAAGCC7T TTG77TGGC7 CGG3GGCGGA TG3G7GTGGC 
TGCTG77C77 3GG CGTGGAC 7ACA7GGCG7 7C7G7GCGGG TGTCGACGGA A7GC"7CG7 
TGC-CAAGAG7 OGCCKCCTG C77AC3AGG7 GCGACCACGC AGA77GTG7.C CAC7377AT3 
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GACTZrGTGG ACAC3T7AA7 37A77T3G7G GG7A3T377C 7GC3CAG72G C3GGG7C7A7 
37AACA7C73 7C327G7A7C AAATCATGTG GGACCGGGAA 7GGAG7GAC7 AGGG72AC7G 
GAAACAGAAA TT77C7GGG7 CTTC7G77CG A7CC3A7TGT 2GAGAGGAGG G7AACAG27C 1S4ET 
7GAAGA7AA7 TA32CACCCA ACCCCCA2G3 ACG73GAGAA 7G7G27AACA G3AG7G37CG ISS-lC 
ACGACGGCAC CTTGGTGCCG 7CCG7C2AAG GCACCCTGGG T327C77ACG AA7G7C7GAC 1550! 
7AC77CAGC3 GC77GC7GA7 A7A7GAG7G7 AAAAAAC77A AGGCCCTGGG C77ACG77C7 -366 Z 
TATTGAAGCA 7G77GCGCAC A7CAGCGAGC TGGACCG7CC TCCGGGTCGC G7G7AGA77A 
TGr . TTCCGTT —CCTT7TTG A7G777AAA7 7777GGGGGG GAA3CACCGA CAAAGCG7C7 
7TA7GA7777 CGCGAACACG GAG77GGC7A CG7GCT77TG G7GGG3TACG 7AC23AATG7 
7AA7G77773 7ACGGA7GCC AGTAGCATGC 7GA7GA7CG 2 CACCAC7ATC CA7G7C7TTC 
CGTG7C7C77 7GG7A77AGG AA7ACG777G CCT777GCTT AAACG777G7 AAAACACTGT 
77GGAGTT7C AAA7AAACCG AAG7A273C7 TAAACAATCC AAAGAAC7GG TGCGTC7777 
G7GGGG2777 GATTGAAACC AAAAAGAAAA AAG7G7G CAT 7AGTAGC7GC 7G7TGGAAGG 190ec 
G27C2AGCCA G7GCAC7CCG GGAA3G7AAC AGCCGTTCAG AAAGGACGAA AGG77AACCA 
GAAAA3C37G AAGTTCGCGG 7 AG A C AG AG C AGGCGTGCAG GGAG7CG7G7 G77777C7GG 
^.„ 3Ccr3GTA C7CGACCAGT TGA73GGC2G 7GGAGACGTG CGCGTCCTCG CGCACACACC 
3CA7C7GCAA G7A7GT7GA7 AGGGAC7CCA A7AGGCGCGG CTTTGCGGGG ACG77G7CCT 
CGGA23G7C7 GGGGG7TCCC ACGTCGGGAT TTGCTGACGT GGGGG7GGCG GGA7GG7GCC 
G737G3AGTA 7GT7722AGG ACCGAA2737 A7GAG7T7A7 7C7G7GCACC ACGCCAATAA 
AAGGGTGCSC 2A733GTGCC G7777GGGAC AGTG7C3CG7 GAATGTCGGG GCACTCAGTT 
CC-ACCrrTC TCCGGCGTCT 7TGGGGG7C7 CC7GCAGGTT ggcggcaagg cgctccctgt 
GACG3C7GAG CAGGA7G77T G C777G AG 7T CGCTC3TGTC CGAGGGTGAC CCGGAGGTGA 
CCAG7AGGTA CG7CAAGGGC GTACAA77T G CCCTGGACC7 7AGCGAGAAC ACACC7GGAC 
AATT7AAG7T GA7AGAAAC7 CCC37GAACA GC7TCC7C77 GG777CCAAC GTGATGCCCG 
AGG 7 C CAG C C AA7CTG C AG T GGCCGGC7GG CC77GCGGCC AG AC77T AG 7 AA727C3AC7 
TGCGTAGACT GGAGAAGC7C CAGAGAG73C TCGGGCAGGG 7T7CGGGGCG GCGGG7GAGG 19B6 
; _ tcg - a;:t GGAC CCG7CT CACG7A3AAA CACACGAAAA GGG3CAGG7G 7TC7ACAACC 
AC7A7G37AC CGAGGAG7GG ACG7GGGC7T 7GACTCTGAA 7AAGGA7GCG CTCC77GGGG 
AGGC7G7AGA 7GGCC7G7G7 GACCCC3GAA C77GGAAGGG TCTTCTTCCT GACGACCCCC 
T7CCG773CT ATGGCTGCTG 77CAACGGAC CCGCG7C777 77G7CGGGCC GAC7G77GG2 2C10C 
TGTACAAGCA GCACTGCGG7 TACCCGGGC7 CGG7GC7AC7 TCCAGGTCAC ATGTACTGCTC 2 016 0 
CCAAACGGGA 73TT77GTCG 7TCG77AATC ATGCCCTGAA GTACAC2AAG 777C7ATACG 
GAGA77TT7C CGGGACA7GG GCGGCGG27T GCCGCCCGCC A7TCG37AC7 7C7CGGA7AC 
AAAGGG7AG7 GAG7CAGATG AAAAT C AT AG ATGCTTCCGA CACTTACA77 733CA3ACC7 
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T -^- A ~~ T rAGGAAAATA 3=ATAATT3r GG3T3AGGG3 Ar"A=373:. 

G7GGAA7CGT AGTGTTGA37 G3AAAAGGG.-. C~A37ATA7 AACAG3CAA7 _ . . ^ 

--TAA A ATC GTA7GACA7A "GSGGATCA 

aaag37GT" aagtaCoG^ --"^ — " 

TCACGA7GA7 3AAG3AGAA7 GGAGTGAA" AA7TG7AAAA GAGAGTT7A7 TAAGT=3Gr7 
CTGGAGGCGA A GAT GAACAG GAGGGGAGC7 GTATCGGTAT T7GATGG7TT 7GG333.A3C 
AGCG"3T3T 77GAGAAGCA 37TTCAGGAC GCACAGCA7G CCG7CAGG3C 3CA0G3TGCA 
CTGAAGGGGG AAGCC3AGC7 3GGGAGTCTG GTACGGAAGG CGGGCCAGAG GTTTGAGGC3 
C7GAAAAGGG AAC33TCAA7 7T7GCG7CAG CCGCGC3A" 7CCCAC3GG7 7GGC3AGAT7 
3ACGG7CT33 TC3AC3CC3T GGCGGAGC7G AAAGAAGA3G 7GGC=G7GCG 7GTAGA73CG 
CTGGAAGAGA ATGC-A3AG3A 3ACGCCGAGT CACTCCTCTT CGGAGATCAA GGACACAATC 
GTCA33TGGA G3CTTGA7GA TTTGCCCCrS G7GTGCCG7G AAAC7GCCTA AGGGTACCCG 
A3AC crTGG3 C37C3AGATG GCAGG7GAAT CAGCATATAC AGGT3TC3AA 
GA-AAAAAG GCGACCGCGT A77TTAAAGC GCCCCGTGAA TGGGGGCAGT GCACG.ACGA 
GGATGCA3AC 7GG7G3AA3C GTGT3GG7CG TG3CGCCT7T GGCATAATCG TC C CTAT C7 7 
CGAG3ATGTG T3TGTGAA3G AGTTTGATAG CGGCCGG3AG TTTTTC7ACG AGGGAATTG7 
CAACGAC77G A73CAG3"A 373GAGAGAG GTA3C3CATG CATTCTGOTS GA7C7A3AC7 

, „ G-A-AGCC7G 7AGA7GGA7T GTGTATC7TA GAAT3AAGTG 

CAACC7GC7G CAGC7GGACT GGA3TCAGG7 CAAC77GA3T GTCA7GGCGG CGGAG7TCAC 
CGGC=TAATa 3CG3CGG737 C37773TAAA CAGATAC7GT GGCA7GGTGC AC73C3ACG7 
TA3777AGAC AATAT7TTGG C7ACAGGAGA C77AACG7C3 ATGAACCC73 GGAG3C7GG7 
CCTTAGCGAT 7TC3G77CCG 77G3G3TACA 3TCTGGGAGC AAGTGGACTA ACC7T3TGGT 
3ACC7C7AAC C7GG3G777A A3CAACACTC- 7TACGACT7C AGGGTGCCAC CCAAAGT7A7 
TTGTAAGGA7 CT7TATAAGC CG77TTGC37 =CT=TTCCAG TGT7AC GTAT C7AG777CGG 

— _ T »„-^^- c AGCCC7AACA TGGGACTGA7 

7AA3A73CAC 3CGGAG3TAT 

CATCGACA7C- TCC7CG7TGG G7TACACTC7 GCTGACA7GC 7TGGAAC7C.T A7C7CGAT37 
GCCGC7AAAC AAGCCTGTGA AG7TCT7GGG TTCAGC7A77 AGAGACGGA7 GC7CCGAAC7 
CATG7ACTAC 7TGGGCTTCA TGATTCGCAG GG7GG7GA7G AC7GAGAT3C TG7CC3C7G7 
G7GGACCA7G ACG7T7GACC 7GGGACTAGA T7GCAC7GGC AAAG"CAGG CGATTCGGAT 
GCGACAGGA3 CACCAGCTGG CGTT7CAGAA GCAG7GCTA7 77A7A7AAAG CCAAGCAAAA 
GGGAGAGTCG TTAGC3AAC7 G77CCGATAA GCTAAACTGC CCCATGTTAA A37CTCT7G. 
-AGAAA3G7A C7 AGAG CG AG A7TT77TCAA CCATGGAGGC GACCCCGACA CCCGCGGAC7 

„ „ A AGAGTA7C7G GT7GACACCC 7GGA7GGGT7 AACAGTGGAT GACCAACAGG 

C7G7GC7CGC AAGGTTGAGG 7777CAAAG7 7TC7AAAGCA CG C CAAGGTT CGAGAC7GG7 
GC3CACAGGC CAA3A7CCAA "3AGCA7GC CTGCGC7GCG CATGGC.TAC AAC7A7777C
ttt7Ttcaaa agtgggcgag tttattggta gtgaggatgt gtgtaact77 ttcgtggacc 22440

GTGTGTTTGG 7GGTGTCAGG TTACTGGACG TGGCCAGCGT GTACGCCG22 TGTTCGCAAA 22500

TGAACGCACA TCAGCGGCAC CACATCTGCT GTCTAGTGGA GAGGGCCAC7 AG7AG7CAGA 22560

GTCTGAACCC CG7GTGGGAC GCCCTGCGAG ACGGAATTAT ATCTTCATCC AAGTT7CA27 22620

GGGGAGTTAA ACAACAGAAC AC77CAAAAA AGATATTCAG CCCA7GGCC7 ATAACGAACA 22680

A~CACTTTG7 2GCGGGCC2G CTTGCCTTTG GGCTGCGGTG CGAGGAGGTG GTGAAAACGT 22740

~GCTGGCCAC CGTTTTGCAC CCGGACGAGA CAAATTGTCT CGATTATGGG TTTATGCAGA 22800

GTCCGCAAAA TGGAATATTT GGCGTGTCGC TGGATTTCGC GGCGAACGTC AAAAC7GACA 22860

CCGAGGGTCG TGTACAGTTT GACCCTAACT GTAAAGTGTA TGAAATAAAA TGCAGGTTCA 22920

AGTACACGTT TGCGAAAATG GAGTGTGACC CCATATACGC CGCGTATCAG CGGCTGTACG 22980

AGGCACCCGG AAAGCTGGCA CTGAAGGACT TCTTCTATAG CATTTCCAAG CCTGCGGTTG 23040

AGTACGTGGG ACTTGGAAAA CTGCCCAGTG AATCTGATTA CTTGGTGGCT TATGATCAGG 23100

AATGGGAGGC GTGTCCTCGC AAAAAGAGGA AATTAACGCC CCTTCACAAT CTTATTAGGG 23160

AGTGTATTTT GCACAACTCG ACCACGGAGT CTGACGTCTA CGTACTTACT GATCCTCAAG 23220

ATACTCGGGG TCAAATCAGT A77AAAGCCC GCTTCAAAGC CAACCTC77C GTGAACGTCC 23280

GTCACAGCTA 7777TATCAG G7A77GC7GC AGAGTTCGAT CG7CGAGGAG TACA77GGCC 23340

TAGATAGCGG CATTCCTCGC CTCGGATCAC CGAAATACTA CATCGCCACC GGCTTGTTCA 23400

GAAAGGGGGG CTATCAGGA7 CCTGTCAACT G7ACCA7CGG TGGCGATGC7 TTAGACCCGC 23460

ACGTGGAGA7 7CCTACGCTG CTAATCGTA^ "ZCZZGTCT* CTTTCCCCGA GGCGCAAAGC 23520

ATCGTCTGC7 7CACCAAGC7 GCCAAC7777 GGTCAAGAAG TGCGAAGGAC ACC777CCA7 23580

ATATCAAATG GGA777C7CC TATCTATCTG CAAACG7CCC 7CACAGCGCG 7AGACGTGGA 23640

CGGGGAACCG C7CGACG7AG TCGTGGACTA 7GACCCCA77 CGCGTTTCAG AAAAGGGCAT 23700

G77GCTTGAG CAATCGCAA7 CCCCATATCC CGCATTAAAA AAGAAGAAAA AAAA7AAAGA 23760

AGCAA777A7 7AAGCAAACA GTATGGTTTT CTGTACGTAT 7TTA7TCCGT GGTGGGTGAA 23820

AAATAACGGG GGATGGAGGA AGAGGGATGG GTTTATAATG CCAATATATC AGCTAAATGA 23880

ATATCA7TTG CG7TTCGTCG ATTTCACTGT CACTTTCATG GTCGGACTGG TATTGGGTCC 23940

TCGGGGCGGG CGTCGATATG TCCTTCACTT TGGCGCGGGC TCTGGTCTTT GCTGGGAGGG 24000

GCGGCGGTTT CTGGTGAACA GTCGGAGTTC TATCGACCGT CGGCGCCGAC GTCGCCAGAG 24060

GC ATGT A7G Z CGCACTCGGC GTACAGAGTC CCCAGTCGCT CCTTATAACG CGTATAACGA 24120

TGGCTAGGAT GCACAGTATA GGGATACAGG AGATATTGAT AGCCACTATG TAGTGGAGAT 24180

TAGCCTGCAC GAACGCG777 7CA7ACCTGA TGACAGGCAG CAGTAGAATC AGATAACCCA 24240

CCAATACTCC CACGTAAAAG CCTACCTGCC GTCTCATAAA CTTTACCAGG AAAAATTCCG 24300

TG7TTATGTA CCACACGACC GTCAAGGCTA GGAACATGTT CACCGCACCA AAAATGGCGT 24360

C7GACACGAG CACGTAAAAG CTGTTGCCAA CGGCCATCAT GGTGCTCAAT GAAAACAGCA 24420
GCATTTCCAA GGCGGTTGTT GATAGGTACA GGTTGACGCA GACCGGTTTC CACCGAGTCA
GCAGTGACTC CAT CATGGT A TTATCAGGTA CGTGCTGTTC CAGGAGAGGT ATTTCCCAC7 
GGGCGGAGTT ACATGTTATC AGTGACTGGA TGTGGGCAAA G GAT AT G C AA AAA.GAATG2 
AG TAG AC AAA GGCTGCCATA AGTACGTGTT TATATGACAG AACATGGATA AACA37.GCA 
TGCTCCACAT CCTTAAGATG G CG A CAT AAA GCACGCTATG TGATCCAAGT AGCGCTATCr 
AGGATTGCAT GCTCATCATG GTAGTGGCGT GAACATGCTT GGCCCGATAT ACGGC2ACCG 
CCGCGAGAZA GTAGTATACT ATGGCAATGC CGTCCACGAT AAAAGTCCAA AATATGTACA 
CCAGCATCTC TGGTTTCTCT AAAAACAGGG TCGGGGTGAG GTGCTTCGCT GAGTTGCGCA 
CCGTGAGGTT TAGCGCGCTG TAGTTTACCA GATTGTTGAA GTAGCAGGGG AAACCAAGGC 
CCTCGCACGT GGCGGCCATG GGCACGACTG CAGAGCAAAT GTACATAATT ACAGCCACAA 
ACAACAGCTT G AC C C AG GAG GACATGAGAA AACGGTCGCT CTTTGAAGCG CGCATGTTT7 
TCGGTGTTTT TAACTTTCGC CAGGCGGCGC TGCGGCGGGA GAGCCAATCT GATGC7ACTG 
CCTATCGCGG TTGACTTTTA AATACGCGCC CCGGGCAGAA GCC AG AGGTA GTCGACTCA7 
TGAC7CAA7G G C AA C G AG C G AAGAAACGGC GGCCGGTTAT GTCATCGGTG TCTACTTTCA 2 5 26. 

GTCC ACTGCC G C ATT ATT G 7 CTGGCAGGTT AATTTTCTAC CCCTGGACCC 
AAACGACGGG GAGACTGAAT GCTACTTTGT GGTGGACACC- CTGACGAAAG AGGCGATGGA 
G2GCATG2C7 GAAATCCAGG AATGCGTCCC GTCTATTACT GAACACGCCC GTGACCGG7 
GATCTGGGAG TTGGCGCTGC GACTGCAGAA 7CAGACGA7C GTCAAGGC2G TC2GGA2AGC 
GTCG2TTCCG GTGGTTCTAA TTATGACTG7 GGG7CGCA7A GTGAATGATG TGATTCCCTG 25560 
CCCCAA2GC2 AGAACACC2A GACCACTAGC CTGTGCTTAC CTACAC7G7G AGGC3ACGG7 25620 
GACCTTTGAG GTCCCACTAA CCGGGCCCGC GGCGTCCACC GGAACG7GGC A2AG2TCTAT 
CTATAGGGAA TGTGCGATC7 CGGCTATCGA GATATGCT7G AAGACCAGTC G AG G CAT AT A 
CTC2TGCCAG TCGAACGAGG CC 2 CTGAGG 2 CAAGAGGGAA AAGCGAGGTT TAGACATATC 
AGATGTGT77 G7CTGTCTCA CGTATGATAT C27TATCG2A GGG2GG3T2C TTTCTCTGC7 
GGTGCCCCAC GCGCCCGCTT TTCACGTCTT ATGGATCAAT GAGGACAG2A AGTGGAACGG 
GGCAG2CGTC GAATTTTTCA GAG2CCTACA CC AT AAGCTG TTCAGTGAA2 GCAAT3GTA7 2 5 980 
ACCCC7TCTG TGGTTGTACG TGT72CCGGG AGC7GTGGAA GAGGGCACAG CCTTTGCGCC 
„~TZCCCTT GCATACCTTT GCGGTATGGG T2GCCTAC7T CT CT G G AC AG 
GG2GTCCGTG CAGTGGGAC 2 TA77TGAACC GCACATCC7G AC22ACTTTG ACGGGATAAA 
GCGAACTTCT TTGG CAGATA 2AGTGTTTGG GTACGA2T22 CTGG2CATTT CAAGGGAATG 2622: 
TGAAGAT2AG TATGTGTGG2 CCA7GCCTGT CACTGACATT AATATTAATT TGTG2ACGGA 2 62 BO 

_ GAA'- — -GG"CTG GTGG7CGTGA A727AGAAG7 26340 

TAGTGACAC7 ATGGCCATw.- .^o«G«^-^ - ~ 

CCTGTTGCGC 

AGT3CCGTAG GAGCGTGGAG CTTAGATACA « ~
GTCGACTGTA TTATCATGGT CCACCTCTAG GGGTCACAAA TGGGCCGCAA TCGTGAAG7G
GAAGTTATTT TTCCTCGTCC AAGCTTTGGA G227GAGGTG AGACCTACTG TCCCTGCT7G 
AAGCGGAGAG GGGGTGGTGC - GAGTTGGCAG TTGACG.33TT T3TGATAGC7 GGAGT37TGA 
CCACGGCACA G G A C C C ATT A ACTTTG7TAT GTGTTTATTT TTAG GAATGG TCTCGAGAAT 
TCAAGGATCT CAAAAGGG C 2 TGCCAGATGG CCGGGTTTAC TCTGAAGGGG GGGA7TTCGG 
GGGATCTTG7 ATTCT7ATCG CATG CGAACT TGCTCTTTTC AAC7TGGATG G GAT AT TT C 7 26S2C 
TCCATGCAGG CAGTCCAAGG TCGACAGCGG GGACGGGGGG TGAGCCTAAC CCACGTCACA 263 EC 
T2ACCGGACC AGA2A7TGAG GGAAATGGGG AACACAGAAA 7TCC7CCAAC C7CTGCGGC7 
TTGTTACCTG G7TG7AAAGC 7TAACCACAT GCATTGAACG AGCC2TAAA2 ATGCCTCCCG 
ACA7TTCC7G GCTGCAGCTG ATAGAGGAAG TGATACCCCT GTATTTTCAT AGGCGAAGAC 
AAA GAT C ATT CTGGCTCATC CCCCTATCGC ACTGTGAAGG GATCC2AGTA TGCCCCCCT7 
TACCATTTGA 7TGCCTAGCA CCAAGGCTGT TTATAGTAAC AAAGTCCGGA C7CATGTGT7 
ACCGGG2AGG 77777CGC7T CCTGTGGATG TTAATTACCT GTTCTATTTA GAGCAGACTC 2 724 0 
TGAAAGCTGT 7CGG2AAGTT AG7GGACAGG AACACAACCC CCAAGACG2A AAGGAAATGA 27300 
CTCTACAGCT AGAGG2CTGG ACCAGG2TTT TATCTTTATT TTGAAAAAAG GGAAACAATG 
GGGGGTTTGA AAAGGGTGCA CATTTT2AGA TATTTTAAAA CTTCATTGTT 7TC7AGGTGC 
TTGGTAAAGA TGGTAT2A7A ATAAAAAATG TTTA7TGGGT CCGCGCAGGT TTGTTTGTCA 

^„_ rA T CT 2CA7TAGA CTCCAGTTTA AAAGACT7TA GATAAATGGG TTTCATTAGT 

2G2CCCATGG GGGTT3AAGG GT2G27TAT2 G C CTTATG AA G7TTAAA7AT AACGAGTGGG 276C0 
GTGGC7CTGA AATGAT7GT2 CACGGACAGC TCGTAAACAA AGGCGGCCGT GGCAGTCAAC 27660 
GT2T2TATA2 7GTG2ATGAC GAAGG27GCG TCCATCCGCG GGGT77T7TC ATGTGTCTTT 2 772 0 
GTGG23CGA2 AAATAA7AGA 7 77 7 AAAAA 2 G77GG7GA2A 7GT7T7GA2A GTT772GAG7 27780 
AT2GA7AA7A GGGAG2AGAG CTOGGTTATG CCGGGAGATG TAGGTGTAAG GAGG7A7A7T 2784 C 
CG7T2TTGGA ACACGTGAGG GTGTAGG727 A7GTGGGT7A CCATGTCTTC GTG7722ACC 27900 
AGGCACA22A CCGTAAATC2 CA2AAAG77G GGCGAGGACA GGCGAGATT7 2A2GTGCTC7 2 796 0 
7TGAGACACG 7TATATCTAA GTGG222ATC ACGGACA777 TGGGGGTATT G7TT22AA77 
AGTGCGTTG7 TTTT2 2TA7G 2A7TTG2AGG ACAAGG7GGG G7AGGACAGG GTGGGGGTAT 
ACGGGACAGG C7T7TT7TGA CT73CGAGT2 7TCGGGG GAT GAG7AG77A7 7GG2A2T7CA 2 614 0 
G72AGT2T7G 7CAGGG7G77 7T72AGGGAC A7777CGAAG GG7GG7G7AA C7AGA2AG7A 26200 
TTT7TGTG2 2 ACGTGGGT7A 7A7AGACAAA GAGT7TG7TA G7 27GA7A7A AA7AGG77G2 
GA7GTC77G2 AAGCTGGAGG A7ACGAAGGA G7GA77AATG AG77G2AT27 GAAGGAGG72 
CGGGA7CAGA TACGTGAATG GAC2AAGCAG GATGGATATG G7GT77TGAG AATAGGTGA2 
GCTGAGGCGC TGGG7TTGGT TGT2AACAA2 GGGAGC7AGG 77 G TAG G 777 G AAA GAT 27 2 
GC „ CCCAC AGGTrcGTGA GAT GTTT 2 A7 G7TTT7T7T2 A2TGGGGGTA 7G7AAGAAGA
GAAAAAG77A TTTAGCACGG :AC733C73A 7GGGATA7GG GAAGACG7TA GC7G3

GGGG7CC7G7 AAAGGT3GCA GAGA7TGAAA TGTGT7GGCG GTCAGCAGA7 ^»==-- 

G33AC=CTTT GCGTGA33GG G37GTTGGTG T3AGAG3737 G7~GAA7AC A7777AG". 

C777AT3GAG AGCTCCCTCT 3 CT77T C AAG TTGAGTTATT G7G7CAAA77 G77GG777A7 

:TG3TTG3 TG afflttCTTK AAACGCTGT7 GGACACC7GG C3CC73A3C3 "7GAG7337 



CST ~ CTT3G CCTGTG GCGA A7AGTT7A77 CTTGTCTAC7 A7G7777GG3 



C3TCG3T :sss: 
2ss:: 



GACAAAG7CC TCGACGACG7 CGG73ACAC3 GCTGAG7G7C T73777TC7G CGAG7773A7 
GAGCAGG77G AGGAGC73TC G377GGGGTC TGTTCTCTGA GAGGCCTGCT CCAGG7G337 
3A73A73TC7 7737ACACA7 TG77ACAGGC GC77GGAAGC- AGGGC377GG 7GG3GG3737 
G77CAGGA3G TGGCAAAG77 T7G7GTGC7C TGC3GTCGGG 7GACAGC7CA 7AA7GC73G7 
ATACA7GG7C TGAA7GGGG3 TG7CAAAGA7 CACCCGC77A GCCAAGA7GG CGGGCA7AG7 

„ AC «_- AA - c: .T77TCTGC77 ATACAA7C ZZ ACGAAAG7G7 7777AA3AC.-. 

G73A7A3737 A7337CAG37 C7GAG7AGGC CGGAA7A7AG AGGGCG377A AAC7AGA3A2 
CAGG77GG7A A737CG7GAG 7CAC3G7G37 GAG7A7CGG3 337A7G3777 7773A"AGA 
G333A3ACGC 7G3CAA7C77 73ATCAGCTG 77CCTGGATA GAG7TAACGA 3=77373371 
GGG7G7G7G3 77GACGACTG G7AGGA77C3 TAC3GTGA0C AGCCAG737A CG7A73737. 2 5-5 6 
A7AC3AGA3C 7G737C77GG C37AGAGGAC GCGG77GA7G G3A77GAGAA GCA 
TAA7377A73 C3CA7AG7C7 G33C3CAGGA 37C3AAGG77 GA33777737 
c _ 3 _- c ___ r __ T _ CT3GCC AC37337777 73C73A33AC 7C37A7G73C 
CAAGAC37G3 7CS7AG37AC A37733CCAA. 7G3A77C773 7A3A3373GA 7AAA7A3773 
— 73AAAAAA ACACCCGGG7 77CG3AGGr7 GCA37G7A3A 37373AC-TC 73ACA7AA3A 
AT ._ C _ TGCC - TG .- AGGATCT CAAAGAGGGA GA7GGA3A37 73GGAAGG37 GCA773A7A7 2 9E2 0 
G3ACGAG3C3 A3CC3CG3G7 7CA7CC7CAA CA73A3A7C3 GA7GCCAAA3 7CA3GAGr37 253E: 
A37GGAACAG A77GACAGG7 7373AAA7A7 CA77A3G7GG "33333A3A 7G3337337A Z99^ 
7GA337AGA3 773GA733A3 7G3AA3A33A A333CCC777 C7G3337777 C33rATACG7 300CC 
AATAACGGGG ACTGGAGGAG CGGGGAAAAG CACCAGCGTA TZCoC.n. 

=AACTGC „ ATTACGGGGG 2TAGAG7GG7 AGCGGCACAG AA7G7T77 7A G3GG777AAA 2: 12 0 

GCAGAGGAAA G7GCGCAAGG 7AAG7GAG7C C7CCATCGAG GAAG7C2AGA GA7A7GAG77 2 0=4: 
\Z TGGCCAAC7G 7ZAC2GA7A7 TA77GGAGAA 7T7A7GCGGA AG AAA 2 AAA.-. ;^C. 

Z Z 37A 7G G G 7 G 3 AG Z 3 Z- 3 £ 0
GGGGGAGTAT AGCTCCCTC7 G7GAAAGCGC T. .
CAAT77GTGG ACGAG7AACA 77A7GGTGA7 AGACGAAGC7 GGAACCZ7 77 CG7 7 2 7A7A7
TCAGTCGGTC T7CAACCACA CGCAGCAGAG AAACGAGA7A 7CTGCC7G7G A7AA7G7GC7
CACCTTCC7A TTGGGAAAAC G7GAGG7TGC AGA77ATAT7 AGGC7GGACG AGAA77GGGC 
CC7AT7TA7A AACAA7AAGC GCTGTACGGA TCCCCAG7T7 GG7CAC77GC 7GAAGACC77 
AGAA7A7AA7 CTAGACATAT CAC C AGAG77 AATGGACTA7 A7AGA7AGG7 77G7G G77 Z Z 
GAAGAGTAAG A7TC7GGACC CGC7CGAG7A TGCAGGGTGG ACAAGAC7C7 7CA7C7CACA 
CCAGGAGGTG AAG7CT7T7C 7GGCAACGC7 GCACACCTGC C7G7CGAG7A ATAAGGA7GC 
7G7G7CCACA AAGC7TTTCA CCTGCCCAGT GGTCTGTGAG GTGTTTACAG AG CC ATTTG A 
GGAG7ACAAA CGGGCGGTAG GCC7CACACA CATGAC7C C Z ATAGAATGGG T AA C AAAAAA 
TC — . t=a gg CTAAGTAACT ACTCGCAG77 TGCTGATCAG GACATGGCTG TGG77GGGAC 
CT A7 AT CAC A GACGCGTCCA CACAGATCAC CTTCGCCACT AAATTTGTCA AAAACAGCTA 
7GCTACCC77 AC7GGAAAGA CCAAAAAA7G TATA7GCGGG 77TCACGGG7 CA7ACCAAAG 
A -~-^~~~Z A7CC7AGACG GGG AG CTA77 TATCGAAAG7 CA7TCGCACG A7AACCCCGC 
77ATG7G7AC AG777CC77A G7ACCC7GC7 A7ATAATG C Z A7G7AC7CAT 777ACGCGCA 
CGGGG7GAAG CAGGGGCA7G AAGAAT7CCT CAGGGACC7C AGGGAAC7GC CGGTGTCTCA 
AGAGC7GA7C 7 CTG AG A7G A GC7CCGAGGA CG77C7GGGG CAGGAGGGGG ACACAGA7GC 
C77C7ACC7C ACCGCCAGCC TCCCACCA7C CCZC?.CCZ^C GCGGC7C77C CAACAC7GG7 
GGCC7A7TAC TCCGGGGCCA AGGAAC7A77 C7GCAACAGG CTGGCCCTGG CACGCCGACA 
C777GG7GAC GAG7TCC7CC A C7 C C G A777 77CAACG777 ACGG7GAACA TCGTGGTGCG- 
AGA7GGCGTG GAC777G7G7 CCAC77CCCC CGGGC7CCAC GG7C7AG7GG CA7ACGCA7C 
C ACT A 7 AG AC ACC7A7A7AA 7CCAGGGA7A 7ACG77CC7C CCAG7GAGA7 TZGGZZGTZZ 
AGGAGGACAG CGCCTCAGCG AGGACC7GCC CAGAAAGA7G CCC7CCA7AG 77G7CCAGGA 
ct _ t;:gG3 3 ttcattgc;:t GCC7GGAAAA 7AACGTCACC AAGA7GACAG AGACCC7CGA 
AGG7GGCGAC G7G77TAACA 7A7G77G7GC AGGGGAC7AC GG7A7CAG77 C7AA7C7GGC 
TAT3AC: - ATA G7GAAGGCAC AGGGGG777C AC7AAG7AGG G7GGCCA7A7 CG77CGGCAA 
CCACCGCAAT A7CAGAGCCA G7C7AG7G7A TG7GGG7G7A 7CCAGGGCCA 7CGACGCTCG 
77ACC7GG7A A7GGACAGTA A7CCCC77AA G C7AA7GG AC CGCGG7GACG CCCA37CCCC 
A7CC7CAAAG 7ACA7CA7CA AAGCCC7A7G CAACCCCAAG AC7AC7C7GA TC7AC7GACC 
CGTACCCCTC 7C77AGGACA CTGA7G7G77 7GG G AA7 AAA GCA7GAGAC7 TGACAC C7AT 
AA7GG7CTGT A7TGACACCA 77C7777A77 7A7CAG7CCA GCCACGGCCA GT7A7A7GCA 
CCG7T7CCAC ACAGGGGTGG CG7GGAGGCC AGGA7GCG3C- 7TGGG7CGCT GCACC7GGAC 
CC2GCGG7AG 77G7GC7TCC TGATGAAA7C GAG7GGGCGG AAG7AC7GGG AGA77GGG77 
GGGAGG7GAC CC777G7GC7 CGACGGAGAC ACGA7CACGC TCACGGCGGA CGA3GG C7C Z 
TCGTC7GTG7 CACTCCCCGA GGA7A7AA77 A7CACGGACG CCACTGC777 GCGGC77AAC- 
77TGG77GTC 7C7GGCAGCG CACCACA7CC 7CGC7ACCAG AGGAGGCGGT AGAC7GCC77
TTGCGCTTCT GSCCCACZTC CATGAGCCCG ATTCTCTGAC TCAATACT7C CCCTTGGTC7
TCTCC3TCCT CCTCGGACGA GGGTGGCTGG TGGGAAAAAT GGCGCGCGTC GGTAAACGCG 
GCC7CATTGT TCACGTCCGG AGAGTTGGAA CTGTCATCGC TAT GAG AG T C CGATG7CAGG 
TCGACGATCC- CGGTGGGTGC GGCGCGCAGG GGGCGCCACG AGGGCCCTTC ATCAGGGTCG 
CTGTATGGTG AACTTTGTGT TGCAGGTACA CTATTT CTGG AAGCAGGT3A AAG7CCGTA7 
GCCCCGGTCC CAGTGTATGC CGCCATCGGT T C C AGG AT AG CAACCCCCTC GTCGTCTGAA 
GG7GAGAGCC CAGCAGGGGA AAATGCGTCA TCCTGACTAA CCCATCCGAT GGACGCCTCG 
GACTCCGCCG TGTCCGTTGA ACTGCGCACG CGGCCCGCTA CCACTGCTAC 
GTA7GGG2CC GTCTGGCCAG AGGCCTCGGG CGCAAGTGAG ATAAAGGT7G 
GCAGGGTACC CCTCTGGCTC GTCTTCCTCC TGAACATCGT CATTTTCTTC TTCATCTTCA 
TCTTCCTCAT CCTCGTCATA TT G AG ATT CG CCGCTCGACT GATCCGGGGA TATCTGTAGA 
TCCAGAGGGG TTGCTGGCGG CGATGGCGTG TCCTCGGCGA AGACGTCG7C TGGGGCAGAC 
ATATCTATCA CGGTGGGTGC AG CAT AG C CG CGCGGCCTGC CAAATCCTGG AAGTGATGAA 
AGA33CGGAG GTGGGAATAT GAAC7TCACG GGGGGTCGTC TGCGAGGGGC TCCTTCAATT 
GGAAGCATTC TCTCTTCATC GTGTGTGCTA GACGAGGTCC TCACAAACAT CGCCATGGCC 
TTGTACGGGG TTGAC'CGCTA GGGGCGGAAA TTTACAAAGC ACACGAGTTA TTGCCTTTAC 
TGCTCCAACA GGCCCCAG7C CACAGTCTCA CGCCGGTGGC GAG7CAAA7A GT 

TG r AAAGT GVTACAGCC CTGGAACCGA GGCCATCGCG AGTGTCGGCC ACCAAG 

G C C AG C G G AG A7GGA7GC7G GGCCG7AAGC ACCAGGTGTT TCTGTGCGTT TATGAGCGGA 
GTTC7G7CAA TGGCCTTGCG C2C2CACAGG AGAAAAACGC AA7G7727AA C7TTGAGGA7 
ATGCTACTGA TGATGAAACT CGTGAACCAA TCCCAGC2AA GTCCCTCGT3 TGAGCCGGCC 3364: 
CT2CCCTTCT CCACCGTCAA AAC7G7GT77 AG7AGCAACA CACCC7332G AGCC2AGCT 
TCGAGGCACC CGTGGGAAGG AGTACTGAAA T7GGGGACGG AAGCC7C7A3 CTC7C7AAAG 3 2 56 
ATGCTTCTCA AACTGGGTGG AACC7GACA7 TGCGGATCCA CAC7AAACGC CAGGCCAGTA 
GCTTGGCCCT TGTGGTACGG GTCCTGGCC7 AAGATCACCA CTTTAATATC CTCT3GATCG 34 0B: 
CAGCAGTGGG ACCACCACAT CAGC77G7CC TGTGGGGGAT ACAC7G7GG7 GG7TAGCCTA 3414 Z 
AGTTCCCGAA T CTGTCTGAG CAGZGAGAGC AGTTTCTGTT TCAGAAAT3A 7GAGAGGC7C 
AGAAAGGAAA TCCACTTAGG TGCCAGTAAC AGATCCCGG7 CG7C2ACCCC C7GAC7GA7G 3426: 
GATAGGGTGC CCC7AAAGAC C37C7G77GC AACCA73CG7 CCA7377GAA C77A777TCC 34320 
C777TGACC7 GCGTGCGC7C 7CCGGC7GC7 GC7777AGCC CGAG7C7GAC 772C2-C7AAC 

, -> - -t- a -" CG77G7GAA7 3444 Z 

AGAACCTGTC CGG7TCA7GG CC777C2-,w ^ A J 

AGAGC7A7C7 GCAGTGGTCG CG77AAAACC 7ACAG7A7AG GCCG7CAAAC 772G77G7AA 34=0, 

A7ACCACAAC AACC7CAGG7 TT7CCTGCGA CGCCCAGGAC C7CAATC772 GAAC3A2CGC 34 56 0 

GAC7AAAAA7 GACCTCAGA7 7AAACCCA77 CACGCA7G7T T:^:^— . 3^c-- 
TTTGC7TCGC AGC7TGGCTA TACAGAC3CC GTTGCAG7GA TTCGGA7CGG CGAAGTGGAT 34680

AGAGTGGACC GCAAAGAACA ACGGCAGGGT AGAGGCTGCC GA7GCCTGAA TTGCGCAACA 34 ^4 C 

TGGTAAGGCG ACGTATGCGT^GAGATGTGAC CAATAGGGTG GTCCACAGGA CGGCAAATAG 

CGCAAAGATC CCCATGGGGC AAATCCGGGT TTCACCCTTG TGTTGCCTGG TTCGGTGCTC 34 6 

CCCAGGGAGC CCCCTTCCGT AATATCTGTT TTATATAGTG AGGGTTCACG CATGCGCGAG 

TCCCGACTAA TGAGGACAAT T A CTG AAATT GACCTTTTCG CGACACGGGG GTGAGGTCTA 

TTTCCCACGA CATACTTCCG CGGAAAAATA CCCACGCTCC TTAATTTCCG TGGGAAGACG 

ATGGGGGAAA TGTGG CATTA CCTGACACGG TTTCAATCAT ACTCATCGTC GGAGCTGTCA 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35100 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CACGTCTGGC TGAGATTTTC TAAAAAGTCA TCCAATGAAT CATCGGAATC ATCAGCACAC 
TCTAGAACTA CTCCATATGC CGGGGTGCGC GGGGGTCCCG AGTAGTGCAC GTCGCCATCG 120

GGAGACACAG ATGATGGGTT TG AAATGT C Z ATACGGGCCG TGTGCACAAG GGTCACGTCC 180
C-AT'—CCAA CACAAGGACC TTTAGATACC CTCTCCCGGC ATGTGCGCGT ATCCGGGCAA 24 0 

GCAAGCTGGT GTTCTGGATT CCAAACGTGC CCAGCGGTAC CCAAAATCGC CAGGG2GTGT
TTTATTATTT CCACAGGAAC CGGTTTCTCT AATTGCATCA 2CAGGGTATC CAAAAGCCGG 
GCTTCCACGT TGATCCGGCT TACCGACAGT TCTTTCCAGG GTTTCCTGG7 GGGGCGCGGC 
AG C TG A CT C A AAAAGGTCA2 7GCCTCTGC2 CATGGGCGGG TGGGTGACAG TCCGCCATAC 
TCTTCCAGGA CACTGGCCAT GCAT3ACTCC AACCGTCTCA CGTCCGAGGT AATGTGCTCT 
ATGAAGATGT GGTAGAGCCA GCAGACGTTC AAACACGATG AAATCAAGCT AAGCTCCCGC 
CGGAACTCCA CATC C AC AAA GGGG7ATTG C TCCGGTGTCT GTATTAGGTC TGGAATAGAA 
AACTCAGAAA AAGACACTGA CCCAZCAAGG AGAACCTGGC GTCTTGCAAA GTTGATGAGC 
C C C G C AG AAA GAATG7GTCT CCCGTGGGAC AAAGAGCTTG GGGGGGCAGA GATGGCGCTA 
CAG7GGGTGA TTTCTTCTAC 2ACGGTCATA CATTGGTGGC ACCCACAGGC CTGTTCCAGT 64 0 

ATCAGCATAA AT C T AT CTTT GCAGTCATCC CAGATCAAAG TCATGTCAGA TGCTGTTGCC SCO 
TGGCATTTTG CCCGCATGTA CATTTCCTGT CCCACATATT TTAACATCT3 TAATACTGGA 96 0 

AGTAGATTCA GTCTGGTGTT GAGCCCCCCC GGGGAAGCCA GCGTATGCTT CAGGACCACC 1C20 
AGGGACGCTA AGAACCCCGG GTGTCCGCGC TCCGGAAACA GACCTCTGAG AATACGCTCG
A3GGTC7CCC
GGCAGC7ACC 
CG7CCCC777 



:tctctct. 



GGCC7CT2GG 



A77A7GAAAG 



GG7ACCGAAT GCCACAATC7 GTGCCC7CCA GC7CTCACAA 
AAT7GGGATA CACACCTCCA TG77CAG7CA CA7GTAC3C7 
CCATAGGACC CAG CTACAG C TTA7CC7CCA C7AAA7V* 
TAAGCCCCGC C CAG AAA C C A. GTAG GTGGGT GGCAA7GACA 
CCTTACTGGG CAAGGGGTAG TC7G77G7GA G AA7AC7 G7 Z 
GCAAGATGAC AAGG7AAAGA 7CGACC7777 TATTGTA7A 2 
TGGTGTAGGT GGGAG CAG AG TTCGCCAAGC 7C7ACG7CCG 
CTTATTAAGT GTTCGGTGTA CTTGACCAAA GCCGCGGAAC 
TGGTACCAGG CAAAAAAGGA 7CGGGCGG7G C7777CAGGA 
A7G7GGACAA GC7TCTG CTC GTAAATGCAC CGC7GG7ACA 
AAAAAACAAA GGTTCAGC7G CACGTTAAAA 7C7G7A7CCT 
G777C7ACCA AGAAAAACTT 7777ACCACG C7GGCCA7CC 
G7CCCG7TG7 GCG77G77AG GATATCGGT^ Aw* ± -vaw*«- 
ACAAAA7GGG AGAGGCACCA C7C7G7GCAG 7CCGCGG7C7 
GCCG7G7GGG GG7A77GGAG AG7CAAAAC7 C7GGGCAG7 Z 
AAAC C7A7G Z AGCCAGCG7C CAC7AG7GGC AGCA7GCCG7 
TCG77GCGAA G777G7ACAA CTGC7G CAGG GAA7AAGCCA 
ACCAGG7ACG GC7CGC777G 7CGG7GC7GG ACCAA7ATC7 
AGGG7CT7C7 CAACG777AG AG CGGG7ACG 7GGCAG7C7G 
AGGG7A7C7A AC7CC7GAAG 7A7C7GA7CC CAGGACGGGT 
77GAACAGG7 GA7C777AAG GGGCC77CTC GA7G7CA77G 
TC7C7CC7TA GGGTAAGAAG C77CGGCGG7 CC7GTG7GGA 
ACGAAC7GAA GGCCCAAC7C 7ACCAG737G 7GC7CC77A7 
"G7AGGA7CG CAG7GACC7A AAT AG A G 7 G G 7GGAAGA7G7 
AATG77CCAA GC7TGGTGCG CTATGTGG7C 7G777ACAGA 
7C7GC77777 7CG7GCC7C7 CGAATGAGGA CCAAAGGCGC 
GCGCAGAGGC A7CCCAAGGC A77A77C3GA 7CC7CACGGA 
AAAAGGCA77 TC7GACAG ZZ GCA7GCAGC3 GG37GAGCC7 
TACTACACGA AA7ATACAC7 GAAATGAAGG CCAAA7GC-. 

TCTGCAA7CG GAGGCCCA77 A7GATA77AA CCT-- 

ACGA7ACCGC C3GGC7GC7C 7CTGA3CAG7 CCAGGGCCC7 
CGG7CTACC7 TCCGAGGA77 ATGGCGCCGC TGGAGA7CA7 
C7GAAAAC77 77 AC AG CATC ACCGG77C7G C7GAGAAACG 
7GG AC7G7CC7A7 CCAGGAAGCG G7C7CA7GCC
AGAATC777A A77TTGCCAA TCC7GGAGCC AGGAC7G77G CCGGCTTCCA TGG7AGACC7 3180

CAGCGATGTG CTGGCAAAAC CCGCCGTTAT 7CTGAGCGCC CC7GCCC7GA GCCAG777G7 3 24 : 

CATTAGCAAA CCCCA7CCCA ACA7GCCGCA CACCG7CAGC A7CA7CCCCT 77AACCCA7C 2 2ZZ 

GGG7ACAGAC CCGGCGT7TA 77AGTACG7G GCAGGCCGCG 7CACAGAA7A 7GG7G7ACAA 2 26 Z 

CACATCCACC GCGCCCT7AA AAC CGGC CAC CGGTAGTTCA CAGACGG7G7 CAG7CAAGG C 342 0 

GG7TGC7-AA GGGGCCG7GA 77AC7GCGAC AACGGTGCCG CAGGCAATGC CAGCGCGGGG 34 8 0 

7ACCGGAGGG GAG77GCCTG 7AA7G7CAGC G7CCAC7CC7 GCAAGAGATC AGG7 CGC7G Z 3 54C 

ATGTTTTG7C GCAGAGAACA CCGG AG A77C 7CCCGACAAC CCGAGC7C77 TCC7GACGTC 
A7G7CACCCT 7C-CGATCCGA ACACGG77A7 AG7GGCCCAG CAA77TCAAC CACCGCAATG 
CG77ACG77G 77G CAGGT7A CC7G7GCCCC C7CT7CGACA CCACZZZCCG A77CAACAG7 3 72C 

CCGGGCCCCG G7GG7GCAG7 TGCCAACAG7 AG7CCC7C7G CCGGCCAGCG CGTTZZTCCZ 3 78C 

GGCGCTCGCC C AAC CAGAAG CC7CGGGCGA AGAGCTTCCG GGCGG7CATG ACGGAGACCA 
AGG7GTGCCG TG7AGAGA77 CAACGGCGGC GGCTACGGCG GCAGAGGCGA CAACACCCAA 
ACGAAAGCAG AGAAGCAAAG AGAGGAGC7C AAAGAAGCG7 AAG G CTTTG A CCG7GCCAGA 
AGCCGACACC ACGCCA7CGA CCACGACACC TGG7 AC CT CT 77GGGA7CAA 7TACCACCCC 
CCAGGATGTG CACGCCACGG A7GTCGCCAC GTC7GAGGGA CCA7CGGAGG CACAACCCCC 
GC7ACTG7CG TTACCCCCGC CACTGGACG7 AG AT C AG A GT CTATTCGCCC TGTTAGACGA 
AGCGGGCCCT GAAACATGGG ATGTCGGGTC GCCTCTCTCC CCCAC7GACG ACGCGCTG77 4200 
GTCCAGTA77 CTGCAAGGAC TGTACCAGCT GGACACGCCA CCGCCTCTGC GGTCACCCTC 4 26 0 

CCZZGZTrZZ 77 CGGC CCGG AG7C7CCGGC GGA7A7ACCG TCACCTTCTG G7GGAGAG7A 
TACGCAACTG CAACCGGTCA GGGCGACCTC GGCGACGCCC GC7AACGAGG "AC AG GAG T C 
CGGCACACTG 7ACCAGC7GC ACCAATGGCG TAATTACTTC CGAGACTGAA GTGTTCGCAA 
GGGCGTCTGT GCCTGCGTTA ACTTCCCAGG CAG777A77T 77AACAGTT7 GG7G CAAAG7 450C 
GGAG7TAACC 7ACAGA77C7 ACT7AAAATA GC7CA7777C 
GACTA777GT GAAACAATAA TGATTAAAGG GGGTGGTATT TCC7CCG77G 
CCTGGCGTGT AAACG7G7AA CCCT3CCAAA 7GCCCAGAAT GAAG GACATA CCTACTAAGA 
G77CCCCGGC- AACGGACAAT TCTGAGAAAG ATGAAGCTGT CA77GAGGAA GATCTAAGCC 
TCAACGGGCA ACCA7T77TT ACGGACAATA CTGACGGTGG GGAAAACGAA GTC7CTTGGA 
CAAGCTC3CT G77G7CAACC TACGTAGG7T GCCAGCCCCC GGCCA7ACCG GTCTG7GAAA 
CGGTCATTGA CC77ACAGCG CCT7CCCAAA G7GGCGCGCC CGGTGACGAA CAT CTG C CAT 
GCTCACTGAA TGCAGAAACT AAATTCCACA TCCCCGATCC 77CC7G3ACG CTCTCTCACA 
CACCACCAAG AGGACCACAC A777CGCAAC AGCTTCCAAC TCGCAGATCC AAGAGGCGAC 5 04 0 

TACATAGAAA GTTTGAAGAG GAACGCTTAT GCACTAAGGC CAAACAGGGC G CAGGTCG CC 
CCG7GCCTGC G7C7G7AG77 AAGG7AGGGA ACATCACCCC CCATTATGGG GAAGAACTGA
CAA3333TGA CGCC3TCCCA 3CCGCCCC7A TAACACCCCC CTCCCC3C3C STTCAACGCC
CAGCACAGGG GACACA7G7G C7G7T7TG7G C7G77T7TG7 G7C777AAAG GG3GAA37A. 
G7GA7GAGTG ACA7TCTGCG AC 3 CGAAA3 C AAGGGAGA7A GGGCCGGG7G 7GA7GGAAAG 
CA7A3A3AAG AGAG37GCAG CAGGTATAGA CG33AAACAG G7G777A7G7 7GGG3GGG7G 
~77AG7=AAA TGGGAAGAA7 GGG3CCAGC7 7G37G7G7T7 G7AGGGA77A GAA3AAAAGC 
A7G3AGAAG7 ATG77TC37A GCGGCGAGAT TG3AGGCAGA 7AAGGAACAG A77A7777G3 
TTCGCGACAT GCTGATGCGA A7G7GCCAGC AGCCAGCG7C GGCAA3GGAG GCGCCACT=C 
GAGGA7377G AAGG77GGT7 G73GCG7CG7 C GGGGAGAAG GA7GCCAGAG 777G7G7G3T 
AAGAAGGAA7 7GT7ATCGGG CAG CAATAT7 AAAGGGAG3C AAGT7AA733 377AA73373 = 70: 

TGGGATTAAT AA3CA73AG7 TCCACACAGA 77CGCACAGA AA7GCG7G7G GCGG7G37AA 
TCG7A7GGG7 TTG73TGG7G GGGTGGGA7G CCAA77G7CC CACG7A7G37 7CGCA777GG 
GR _ T _ G -- ;> AGA3 GG77GG AG7GGACAGG 777A7CAGGA C7GGG7AGGC AGGA7GAAG7 
GTTCCTACGA GAA7A7GACG GG337AGAGG GCG7373GC7 AAACGGGAGG AGAG7AGGAG 
G73GA7G73C G73GAG7GAG 7ATCGAAA7G TG7CGGTA7C 7377GAAGA7 A337G73337 
G73GG737GG AGAAGA7GCA ATAGA7GAA7 GGGGG7CGGG GGAGGAAGAG CG7CGG373A 
C ^_„„_ 3T GJ , C _ T - TATG acacAAAGCG TCCAGGCCAC CACAGAAC73 AGCGA733G7 
7AA7A73AG3 377773A3G7 G7A77A3AGG 77T3AAG7G7 AATGGG7333 AATT333TAA 6!80 
A33GTGGG7G 7G7AGGGA7A AAGGG7AACC 77AG37TG73 TCTCATCTAC AGGA7GA7A7- 624 0 

73A737GGG3 AA33A73GAG GA3GA3GGGA A77GGCG7A7 GAGCG37G33 AGAAAAC33C 6 300 

AGAAA7AG7G G733TAG7AA G3G7G7GG3A TTTTCT3CCA 33A37ACAAG GAG7AGAG3A 
A3A3AG333C A37AGAA73C A3AAA7ACGG ACG3A7G777 AGA7A37A7C- GGC7GTG3G7 
____„-- _ AC -.- STC;:T TA7A3777AC CTG7GCG7TC CA35ATGCG3 GCG7AA3AAA 
3G37ACA7AG TG7AACACAA AAG3A7AAAA G7AAA7AAAC G7G777A773 77GACA7GA7 6S4C 
AAAGAG7G37 AG73777A37 GG777GGGGG 77GGG77G73 GGG7G373GC 7G37CCGCGG 6 6 00 

7TCAG7CA7C AAC33G3GG3 CG7G77G7CG AG3CTC=TC7 7CG7CGCC7G 77A77GG3AG 6660 
CAGGAGGCGG TTTAGCGG7G 3333=37373 A3A7G 3AGAC G7CGA7T37A AGCGAAAG7G 6 73 0 

c; ._„_- sc AT33733A37 7G3777T373 77A3AA3377 3C7GAA7A77 G73373AG33 6760 
7G33773GA7 777377A33G G333C3G3AC 7GAG7GGA33 3A3AG7AG3G 37AAG373CG 6B,0 
c ___„_- t: 3G7GG33G7C AGA3GC3GA7 3737G3GA7G G33AG733A7 333A.7G377 6 900 

„„ r~~, A733 6 96 0 

7CC3AA337C C3GA7T37CC A3AG73AA77 .—ova-**.— 

TAAG373C77 777GGG7C7G C3C37GGG3G CGG33A7G73 AGG7AC333T A3A737A337 7 0=0 

GTTGGTGA73 37CA3AACAA AAG333AAAT 3C3733777A 7A3C3A3377 7AAA7A3777 7 U 60 

A7TGAAAAAC 3A7AG37T7C GTCAG3G377 G7GC3AG7AA 73A3A73C3A 3737A733A7 7140 

G3A33A3C7C 3733ACAAA3 773AAAAA/.C AAAGA7A7A3 CAGA7A3AAA AA7G7G33CA 7200
CGACGACTAG TAACGCGTTA ATCAAGGCCC AGACGCTAGA AAAGCTAGAA AGGGAGGGGC 7260
TAAAACTA7C CGCGGAACAA G CAACGT CAT AGAATCCTGG GGTAGTGACT GATGTGGGAC 
CGGGCGAAGG CCTGGCGCTG AGCCCAGCCG TACTGGGACT AGAAC3CTCT GTAGA7GA7G 
CGACA7CTG7 CGAGTTGGCC GTAACCCAGC AG TG AC CT AG TATCGAGGCC ACAAATAAAG 
CCAGGGCCAC CGTGGACGC7 GTCATTATGA ACAACCGCCG AGGCTCCAAG CCGTCTATCC 
AACGTTCCGC G7TCGCCTC7 TAT ATA C ACT CTGCAATGCA GTCCGACTCT GCCCCTCTAC 756 0 

CCAGGGTGGA ATATGTGTTC GAAACAAGCA AATTTAGAAT GACGTCGAGA GCAAATGAAG 7£2C 
CCAGACTCAG ACTGACAAA7 GAGTGTCCGA TACTGGTGAG ACCCCACGAG CCG7TCA7CA 76 S: 

TGCCCACCGG AATACAC7TC ACGCGAACCC CTAGCTG CGC T7TCATCCTG ACCGGAGAGA "4 0 

CCGACAAGGA TG7ATTTTGC CACACGGGCC TAATCGACGG AG GCTAC CGC GGGGAGATAC 

AGQ ATTTT ACTCAACAAG AGGAAGTACC CTGTGACGCT GTATCGCGGG GAGCTCAACA 

TCTGCCTGTC TGCTTTCAAT TACGTGCTAC CTCCGTTGAG GGACGTATCA TTCTTAACCC 
CCCCTATGTA 7GCAAACGAC GCCGGATTTG ACGTGATGGT GATGCACTC7 ATGGTTATCC 
CTCCTACTAC 7GACCAACCG TTCATGATAT ATCTAGGAGT GGAGACCCCA GGCCCCCCTG 
. AACCCCACG7 GGCTCTAGCA TTGGGGCGAT CCGG7CTAGC AT CT AG G GG 7 ATAG77ATAG 8100 
ACG77AG7GA GTGGGGACCG CGAGGAT7GC AGCTGAAGT7 TTATAACTAC TCGGGGCAGC 816 0 

CGTGGCT3GC GCAGCCCGG7 AG C CGC AT AT GCCAGATTGT G7TTGTGGAA CGCAGACACA 
TCCTCAAGGG C77CAAAAAG TGC7TGCGCC ATAGGAAGCT AGCTCCTGGC G7CCG7TTCC 
GC-GAGGC7CG AG7GCATTT7 CGCGAGGATA CAAATAGCGT CCGAAAACA7 ACCCACGAAG 
ACAACCCCC-T CCACGAACCC AACGTAGCCA CCGCTTCCGC TGACATTCGT GGAACCAAGG 
GGCTGGGGTC GTCTGGGTT7 TAGAGC2GCC GCCAAATGCG GCCAGT77A7 TAGGGCGATT 5460 
CGATCCCGCA ACCCACAGCA T CCCCC AAAT AAAAAAACGA GTGTACA CAG CCAATG77T7 ES2 0 

TATTAT7GTT CGATTCATTA CTGGTACCAG AG AA7AAAG C CAACC7ATGT CGAACCTATC 
GCGCTTTCTG TCGTCTCTTC CAGGGTTGAC GAAGGCCGGG GAGGGATTGA CGAAT3CATC 
GCGGAAACGG ACGGGTCTTC GGTGGGTGGC TTGGGTAAAG TTG CCTCCGG CTGGCGCGTA 
ACGGCAGGCG 7GAGAGGCAA TACAGAAG7G GGTTCCGACA AGGAGTGGC7 G AT C 7 CAG AG 
G C C CAT ATT A CCGAGTCGTC TGACGCCATA GCAGTCGCCA GTTTTTCCAT CTCCATGAGC 
GAAACGCATT CCCCGGCCCT TTTGTTTAAG AGGGACTGGA GCGCACTGTC GTCCACGGTA 
ATCTCGCCGA CCGCCAAGGC CAG CATTGTG TTCCA7ACGA CCTTCTGAA7 AGACTGCAG7 8 94 0 

T~-ZZ^CZ:Z GGGTTTTCAC GGTCTCCTGG CAGCCCGCCG GAATTT TAGC CACGTCAAAA 9 0 0 0 

CGCTTCAGGT AGTCTGTGAT CTTGTTTGAC TGTACAG C C A GAAGGTAGGT CTGGTGCAGC 
GCCGTCGTGC CAAGGTTCGA CTGGACAACG TCACCCAGAC ACACTCCGGG GGGGAGGCCC 
AAAT CT AT CT CTTGCCGCCA GCGCTCTGGA CAGCCTTCCA GAGGGTCACC GAGCCGCTTC- 
TAAGCGTGGT TGCCGCGTCC AAAAAGGTT7 ATACCGCAAC ACGTCCAGGT GTACCATGGA
GACGACATAC CGCCGCGAGG CG=TGACAGT AAGG3T7AT7 777TGTACGA GTG3C3ACA2
C3CC3AGACG ATCGCGGACG TCC77ACG3G GGCG3CAA3G TGAGCG73G7 7GT7T7C737 ?;c. 
ACTCCACGAC GTT7777A77 3 C 3 AG AT A 37 CGCCCCCAGG GTAAGG3TAA AA77G7G337 ^ZZ 
333333A3GG GG7G3TGG3A ACGG CACAAG GTGTTCGG3C GTGTTGG7CC TA337A37GA 94 £ ; 

CGCATCAGTG GCCTCGGGGT TCCTTGGCGG CCGGCCACTG GAGGCG7CCG ACAT7AAA7A 
TA7GGTGGT3 AG 3 G AC GAG A CCGCGGGGTT GTTCAAGCCG CTGTTGGAGA TAA733G7G3 
CGCGCG3G3A C3A73AAA7C AGGACGCG7G CACTTTCCAG AGCCAGGTGG CC7GGCTCAG 
AAC G AAAT77 G7TAGGGCA7 7GAGAAAA37 TTACAAGATG ACTCCCTCAC C37A37G3A7 
GCTG7CTG3A 77TGGCG37C AGGAAGC3CA GTT CGTCC7G ACCAGGT3AT 7C7A777777 
7GAACACAC7 GTGG737G7A 33A3AGAGAC AGTTTCTCAC C7GTCTAGAC 7G7777GGG3 
7CAACAGGGA 3AGACG3TGG 7TTCCGTTAC C AG C CA CG AG GAGCTGGGGC AGC7A7ACGG 
CACT7CCC37 773AGGCGG3 GCGTC3CCGC GTTCGTCGC7 TATGTAAAAG AG AAA7TAG 3 
GAGAGAGAGT C7GGAGA3GG AGGCCA7CGA CGGCACGATA GACCAGATCA GGGG3AAA37 
CA7GC73T37 AA33AGGAG3 TGGTCCATTT CAT AT AT AT C TCCTTT7ATC AG7GG373AA 
CAAACGGGCG 77CCTGCGCT A C7 37 AG AC A GACGTCCTCT 73AAG7GGT3 7 AAG G GAG C 7 
GGGG3AAGAC 33T3AAT73T G7GGCGCCC7 ACACGGGGAG TTTCGTGACC A33733A372 
C7A37AC3A3 AAAAAAA3C7 AC37ATC3AC 7TACATAGAC A773GG7ACG 7GGG7GG3G7 
A7TAG3AGA3 GG37A7777G 33GGGAG7G7 TGTAGGCGAG CGG7GCG777 AT7GG7GCGG 
GCAGTCAAAG GAGA3GG33A 33C7G77GGC 3AGCATTAGC CAACAGGTGC G33A337GAG 
G77GGAAAA3 GAG77GG373 33A7337AGA CG7GGCCGCA CTGCGAGG77 CCGATGACGG 
T 3 AG77TAAA 3AGGG33777 7G7C3GA3A3 7CAAG3337A 23337GTACA GGTG3GA3T7 
TC7GGG3AA3 3A3777773A 3AA733773A GG AAG ACGG C CTAGAG3GA7 A37333AG3A 
AAGTGTGATA 77733AG3CG A33A33A37G GGATATGTTA TCTGA3AAAG A3373AC37A 
CCGAAT7777 7A33A7GA33 73AGC37A73 G3TGC3AACA 3TGAAGGAAC AG37337737 
773AAGACA0 GAA7A37T3A A33373G377 GCCAGTGTA7 AGATGGG7A7 7AGAG77TGA 
CG7G3CCG73 7G33GG3A3A 773A3AGGAC ATTCGAGGAG G7G3AC7373 7G7377G773 
C3TG3373AG GCCA7A37CG A3A73A773A ACTC3773GA 33AG7GGA7C G733AA3A3A 
C3CA37A7A7 7777T3AAA7 3A33373733 A33GGACGAG 7GG3333G3G AAGAG37333 1032C 
CAG3A33A3C 77373TGG3T 373A73ACAA A3TGGG7A73 337A77A73G 7333377333 10 = BO 
A3AAGGA37A 7G3G733773 G373GGAG33 CATGGTGGCA 3T2A27GG 3A 7T37AAACAG 1104 Z 
GAC3ATAAAG 377GA723GG AG37GG733A 3AGA7733C3 T 3 AA7 A 3 AAA AAAA3GGGG3 
C33T77CGA3 T37GG3A7A7 A2GGCGGA3G ACGAAGCG73 G3G3TT3232 A37377A3AA 

, « — T---"-r;r — T A^GAAGATA C7A3737373 ZyZZZZZCCZZ 1112 0 

GGTGGGCTTA GTGGGwjnAu - - - ->w^<- - 1 

CAACGGCAAG G3G3AG7AG3 7333G3G3G3 37T7A33377 333GAA3733 ..^o-
CCCGGGCCAC AGCGCCGGTC ATGTCGGCCG AA7CATCTAT AGCATCA7GG ATCGCAATGA
GAATTTTTTA GAAAACAAGA CCATTAGCTA TCTGCCGGCC AAAAT A C CT C ACATC7TTCA 
GCGGATAGAG ACCCTATCCG GTCGTTCAAT AGAGGACTGG CTACACTCGG C2ST7TG3GA 1146C 
TAAAGCATAC GACACTATAT GTAAATTTTT C C CAGATG AA AAAGCACAAC AGTTTTC7CA 215 2 C 
CGTTGCATTT ACGCAACAAG GGGAAAACAT CATCCAGTTA AGACCCCGTC AGGGAAGACA 
CTTCCTCTGC ATCAACCATA ATCATAAAAA CAAGTCAAAA ACAGTCCGTG TATTCCTTAC 
CCTTCATTCC ATTAGGGTGA GCGAAGTCAC GGTAACACTT ATGAGT C AG T GTTTTGCCAG 
CAAGTGTAAC AATAATGTTC CCACGGCCCA TTTTTCGT7T GTGGTACCAG TGGGACTGGC 
CAGTTAAT - C CACTATA7AA CCTGGCTGCC AGGT7CCCAA AATAGCCCGC GGCATACGGC 
TCACTTCCCC CCACATTCCC CCCGTGCACA ATATAAGAAC CAAAGGACAT GGTACAAGCA 
ATGATAGACA TGGACAT7AT GAAGGGCATC CTAGAGGGTA AGTCCTCGTC TACAACAGAC 
TTTTCCCATT TCTAACGTAT CGTGCTATCT TCGTCGCCCG GCGGACCATC 
-"A^TTATCG CGTTTGATAT TACAGACTCT G7GTCCTCCT CTGAGTTTGA CGAATCGAGG 
GACGACGAGA CGGACGCACC GACACTGGAA GACGAGCAAT TGTCCGAACC CGCCGAGCCT 
CCGGCAGACG AG CG CAT C CG TGGTACCCAG TCGGCCCAGG GAATCCCACC CCCTCCTGGGC 
CGCATCCCAA AAAAAT CTC A AGGTCGTTCT CAACTGCGCA GTGAGATCCA GTTTTGCTCC 
CCACTGTCTC GACCCAGGTC CCCCTCACC* GTAAACAGGT ACGG T AAAAA AATCAAGTTT 
GGAACCGCCG GTCAAAACAC ACGTCCTCCC CCTGAAAAGC GTCCTCGGCG CAGACCACGC 
GACCGCCTAC AATACGGCAG AACAACACGG GGCGGACAGT GTCGCG-CTGC ACCGAAGCGA 
GCGACCCGCC GTCCGCAGGT CAATTGCCAG CGGCAGGATG ACGACGTCAG ACAGGGTGTG 
TCTGACGCCG TAAAGAAACT CAGACTCCCT GCGAGCATGA TAATTGACGG TGAGAGCCCC 
CGCTTCGACG A CTC GAT CAT CCCCCGCCAC CATGGCGCAT GTTTCAATGT CTTCATTCCC 1260C 
GCCCCACCAT CCCACGTCCC GGAGGTGTTT ACGGACAGGG ATATCACCGC TCTCATAAGA 12660 
GCAGGGGGCA AAGACGACGA ACTCATAAAC AAAAAAATCA GCGCAAAAAA GATTGACCAC 
CTCCACAGAC AGATGCTGTC TTTT3TGACC AG CCGC CAT A ATCAAGCGTA CTGGGTGAGT 
TGCCGTCGAG AAACCGCAGC CGCCGGAGGC CTGCAAACGC TTGGGGCTTT CGTGGAGGAA 
CAAATGACGT GGGCCCAGAC GGTTGTGCGC CACGGGGGGT GGTTTGATGA GAAGGACATA 
GATATAATTT TGGACACCGC AATATTTGTC TGCAATGCGT TTGTTACCAG ATTTAGATTA 
CTTCATCTTT CCTGCGTTTT TGACAAGCAG AGCGAGCTAG CACTGATCAA ACAGGTGGCA 
TATTTGGTAG CGATGGGAAA CCGCTTAGTA GAGGCATGTA ACCTTCTTGG CGAGGTCAAG 
CTTAACTTCA GGGGAGGGCT GCTCTTGGCC TTTGTCCTAA CTATCCCAGG CATGCAGAGT 
CGCAGAAGTA TTTCTGCGCG CGGACAGGAG CTGTTTAGAA CACTTCTGGA ATACTACAGG 12200 
CCAGGGGATG TGATGGGGCT ACTAAACGTG ATAGTAATGG AA CAT C A C AG CTTGTGCAGA 1326C 
AACAGTGAAT GTGCAGCGGC AACCCGGGCC GCAATGGGGT CGGCCAAATT TAACAAGGGT
TTATTCTTTT ATCCAC7TTC T7AAGGAT7G CCAAACCCCA 7GGCAGAGTG TCTCCCG.A:
T C CAT G T AAC TCACGTAGCC TTTCTCTAAT AAACAAGCTA CCTGCAAACT ATA 
GAAATGAGTC AGGCGTGGTC TCTTCTCTAC CGTGAATCGC ACCTTAAACA CAACAC 
CCGCCACCAG GTGGCACCCA ACATCCATTA TGGAAAAACC CCGCGCCACC TTCC3CCAC2- 13 5£ 
TG GAG C C AAC AAACAAGACA CACCCGCCAA TGTTTTGGTC TCTTTATTGA TATG ATATA C I3£2 
TCCCTCCCAT AACAATACGG TGTAGGCATT TTGTATTATT TATTGCATGC- CAT C C C ATAA 
CGGCTTCGGC ATTATTTCGA GTACGACGCA GGCGTCTGAG AAATTACTGC ACCTCGCCGC 
AAAGTCTCGC GGGGACGGGG CGTGGGGCTC TAACTTGCCA ACCGCCACCG GT 



12 "4 

13 = 1 

CCACAGCTTC ACCAAAGGAC ACGTCACGTG AGAGGGTGCT GGTAACGGTG AATTTGCCAA 13 5-2 
CCCCACCAGA AATGTATTCG GGTTAAATAT CCTCGTCGGT TTTCCCTGGG G C AG C AAG A 3 12 SI 
GGGGCCGGAG TCAGGCGGAA CGGTATTTCC AATAAAGTGC ACGGGCCCGT TATGATAACA 13 9E 



1429
TACG CAAAAT ATGCCATTAC AAGAGCTAGT CAGCAGAATG CCTTTTGCAC ATG C G T C C A 3 14 04

C GT A T C G C AT AGCTCCCGCT TGGCTATCTC GCAGGCCAGG TTTGGCACAT TGGGT AG CCA liLG 

7A „ rT3G::c:c GGAGAC CCCA CTGCACAGTA ATGAACTGCG GGGTCCCTAC G CAAGGC CG A 14 15 

TGAGATTCGA CAGCCCGACT GGCTTGTCGT CAGTAACTCA TGAACCTGTT C 3 C C ATT AT A 1412 
ATA „ TCCT3 ataaaCAACC GACCCCAGTC AATGACGGCC TCCTGACCCT CTGCCGTCG7 
ACAAGATGGC ACGGGCGTTA CAATCTCGCC TGGCAAGCAC TGCCCCGGGC- AAAAAAATCC 
CTCTTGCAAG AGACGTGCCA TATTGTTAAA ATCGTGGACG GCTCCGGCCA CGACTCCACA 

TT C CAT G CAT TGTTCTTCCT CCGGTTTACG TACTCTAAAG AC CAGAAAAT GGTGTCCATC 144 £ 

CTGAGAAATG CCTTTGCCAA TCTCTTGTAA ACCCCGCGTC CTGCGTAGCG CGGCAAGCAT 14 = 2 

TCGCCTGCGC CCCCTGGTGC CTTTAAACGA GGCGTCCACG GGCATGTTAC CCCTTTCGC3 14 SB 

GATATACACA ACACCCAATT CCCC3TCTCT G CG C C ATT C A AAACA3GG3T CCGCGAGG32- 14 64 

CGTAACTGGT ATACGGAAG C GGGTGCGCTC TTCGTCTTCC CACTCTACT3 C3GGAAATT7 2470 

TCCACTGTTG ACTTGACATA CTATCCAATC CTTGATTGAC GCTTTCCCCT CACTGGCAC1 24" 

G G T AG AT ATT CTTAGTTGTC GTGTCCGGCT CCACTCCGTT ATCGCAGCCA CCACAGCCTC I4S2 

CCGTGTAATA TCGCCTGCGG CTGCAGAACC CCCGGTCCCG GAGG3TCCTT CTCCC3GTGA 14 88 

CTCCGACCTG GATGGTTCAT CGCAAGGA GC CCCGGAGCCA GAT CTTCCCG GTGACCCTT2- 14 94 

TGACAAACAA G3TTTTTTGG GTATCGCCCC AGGCGCCCCA AAAGG3T7CC- GTCTTTGGCC 15 CO 

1512 
2E16 



TGGGTCCA 

TATGCCGGGG AGCCACCGGC CAT C AGATAT AGAGAGGCGA CAGGC7CTCT AT AT AT C AC 
GCTAGGTGGC TG AC AT ATT A 3TGGGCCTAG CCGCAGAATT GCCT33GTAC- T 



GCGTTTCTCA AATTAACCGA AACTACATTT TTCTATTTTA AG7ACGGGA7 ACAAAGCAG3 1S24 
AAGAGCACTC TACCTATTAA CTC 3 TG GAG A AAC AT CAT AC AAAATCTG7A CATTATTTTT 1 = 26 
AATACTTTAA TTTGTGOAGG TTTCTTCACC CCACACCTGC TTTTTGTCTG G T A C AAAAAA 
CCACTGCAGG GTCCCGCCTA TAG C CAACT C CTAAGCGGGT TTTTTG CTAA AGCACTTTTT 0 54 SO 
7AGACT3TCC CAGAAACCAC AT AG CTTC CT TTT 2 ACT CAT TTGAAAAACA GCCCCGCCCA 1 = 541 
ACTGCCTGGA GAATTTTCCA CCCCCTCTAC CATTTCGCGC CTTTACCGCT GGTGCGAAAT 15= C: 
CTAGCCATCC TAT C A C CG CG GATCCGCTGG AC CAATATAC CACGCCCACT TTTCGTAAT2 
AGCAACCCTC TACGCCTACA CCCCTATGAC TGAATATAAC CCCCAACAAG GCTATGAAAT 
CATGAATGGT AACTGTCTGG ACACCAATCT TCCGCGGGGT GGCGGCAGTG CGACGCAAGT 
ATCCACAATA AATGGTGCAA TAATTGGCGA AATGTCGTGT CTGGTTTATT TGGACTACAA 15 6 4 C 
G ATT AC AT C Z GGTTTTATAA TTC A CATAT A TGATCAATGT AGACTATCCC AAATGGAGCC 1592 C 
TATAAAAATT TTAACAGTCA AGGGTACATT TTGGAAATTT TCTGTAGATG CCGGGGATGC 
GCCGAAAAAT ACCGTCCCGC ACGTCACTGG GTTGACGCTC AGCGGTGTCT GTGGGATTGC 
GGCTGTGGTT GCCAGGTATC GCGCGGTGTT GAACAGCTGC TGCGGAACTC TGGGGCTAAA 
GCTTCGGAGG ATGCGTTCAT AGCGGGAATT TGGATTACCA AACCACCAGC CTTCCACTTG 
AGTGGCGTTT CTGGAGTATA TTCCAGACAT CG AG C AAAAT ATTGGGAATC CGTGGCCAAG 
GCCTTCAAAA ACT 2 3 3 TTC A AAAT CT C CAT TTGCT2GGGT GAGGGGACTG TAAGACGCGG 
TATGCGAA3C AGTTCTGGTA CGAAACTCTG ACATAGGTGC CCCAACCTAT CCCCAACAGG 
C "AG CTA CAT AACATTGCCT CGCCCGCGTC ACCTTCGCGT CTCAGAGTTC CACGAAGGTT 16 3 80 
CCCATACACA AAGATTTCCA CAACAAAAGA CACCCGCTGA CTATCAGGGG GAT C AAAAAA 1644 C 
GGTGGCTTTT CGGGACCGGA GTGGCTAACG GGCGTACGCC GCCCGTGCGG 15 50C 
GGACCTGGA2 CTCGGGCGCC GCCTATCCGT GGCCTGTCTG GTTGAGGAGC TCG3ZTZZTZ 
CTGCAGC7CA GACAAAATGT TACCCAACCC TTCTTCC2AC GTACATATAT CCTCTCCTTG 
AAGGTTCGAG AGCGTAAGAG GGAGACCCAA AGGCGGCGGC ACTAAAGATT GTT2TGGTCC 
ATAACCC222 ACTGCATATC TATCTCCA3C ATATGTACTA ACAAGTGGAA CTCTGGGCCT 
TTCGCCACTA CCCGGGCACA CA2ACTCCCG CCGCTCCAGC TCTGTCGGTA AATGCGAAAC 
CTCGGGGTT2 ACAGCGGGCT CCGGTGCAGA ATAAAGCACC GTAGGTTGGA AAACGCGCGG 
CCCACTGACA GGTAGGGGCG TGGATGCTAC AGTGGTAGAT GGGGTAT CGG AATCCCCAGT 16 9=0 
GAGGT2AATA AT2TCCACTT CGAGGGCACC AGAACTAGTT GTCACG2GTC TGTATCCAGT 1698C 
CGCCATGTTG TCCCCCTGGC AGAC3TACGG TATTCCAGAC GAGGATGGCT CCTGTCGCTC 
T3CCACCTCT G3GGTGGGTG GT3C3CCG3C G3AGGGCGTG GCCGAC3C3C CAC2CTGCGT 
GTGGGAAAGA CCCTGGTTTG GAGCGCCTCC ACTAGAC 2AC GGAATCCAAA 3 2GGTGTG CG 17160 
AACTT2CG3C AC2ACGGCGT GACCAACTGG TGGGTGCCAA ACAGGCGC3C GTATGGGTCG 1720 : 
CGTAGCTG32 GGTTCTGCCA ATGGACTCCA ATTGTAACAT GATGGTTTCG CATACCCGGG 
CGC3GGGG CG CTGGGCGGTT GAG3TTCGAA GGGATACA2C CGCTCACTC3 CAG 2ACCCTG 
AGGAGCCCGG C2TTCTGTAG ATGCCCCGCA AGCGCCTTCG G2ACCGGTTT CC33GCGGG3 
AAGCCACGCG CGAGCACATT GGCCGCTTTG GGGGAGCAAT CCCTG73GCG CCAGAGG7G3 1'-: 
ACCCTGGCTG AACTCACCGA CAAA7GTTCC CGC7TGGGCG T3CGGCGGAA 7CCAAC7GGG 1~:: 
GGCAGCAGGA TTCA3CTGGC TGCTAGGAAT CCCCG7ATAT GTCCAACGGG GGGAAAGG 3 3 
ATCAAATTGG CCCGTGGTTG GCGGATGCAC TTTCTCCGGG AGACCAGACG C3CCC7GAGG 
CCACCA7CCC 37GACAGGAA GATCTCCCCA TGGAAAACAC GCAGGTATC3 ACGGGGACG7 
AGATGGCAGC CT AG AC C CAT CGCGCATGGG AGGGGCTAGT TGCCCCGTAT CCCCCGGCG7 IT75C 
CTGTGCGACG CCGGAGACCC CTGACACAGT ACCGGCAAGC CGTGTTTCG7 GC7GCGGC77 17620 
GGGCGGC3CC 3TGCCCGGTA GGCCTGCACC AGATGAGTGA GGGTCTGAAG GGCCGGTCAG 
CGT7GA7G3A GCAGGCGGA7 CTCCGGGAAC CCGCCACGTA AAGGACGAGG CCTGCG7AA2" 
TTGTCGCGTC CCAGAGGACC CCATACCTGA GGTAGATGCG CCCTCATTCA CTGG7A7CCA 1SOOC 
CACGGAGCAG GCAGCCTTC7 GTTCAGTCGT TATATCGCCA ACAT7GTAA7 AGCGG77CGA 1806C 
777CCGAGGG CGACCCCTCA GCCCCGATGG CGCCTTAGGG GGAGCAGGTG CTGCAGCCCC 
TGCCTCCTZG TAG CTTTGT7 C7C7AAG7AA AAGGCACGAG AG7TAACGTG GT7AGGG7AC 
CTAAAGTA77 TCCCGCCGAC ACCAACGCAT CAAACCTCAC ACCCCCTTCC CCGAG77ACA 
TACC7AGTG7 CACTGCGTCG CGTAGCCGTG GTTTGCATTG GGGGGGACAA CAGACACTGA 
ATAAA7CGCT GCAG7TTTTC AGGAC CAT AC GCGGCCCCAT AGCAATACGT ACA3TTTTTA 
AACGGCG77C GCACCAACTG CCA7ACTACG TAGCTACCAC CAAATG73TC GC7G7ACCG7 
AAATC377CC GCACGACGGC CCTCCTGG7T CCACGCAA2A G7CTC2CAAA ACG7CCA7AC 1B4SC 
ACCG7C7GTC CCACGACAGG CGA7GGTCCG TAGACTCTAT CACAC7CC7C A7CAAA7GCA lES-iC 
TGG7ACACCG AATACCAGCC AGG CGGGA7A TCGCTGCCGG C AG G C AG G G G CGCGGGGGC7 18600 
GZAAAAAGAA GGTTGTTCCT ATCAAACCAG GAAAAATAGG GAAACTTA77 G7777 CAAGG 15650 
G2A7CAA7AA TCCATAACGT GGCCCA77CT GAGCCACCGG C777 AG 3 CAT G3TCC3ACAC 187=: 
AGAAACCGAT CGGCG7TCG7 C77TGAGGCA CAGTCCCGAC TGA3CC7TAT AGT32CCCC3 
TTCTTGCTAT GAAAAAAACC CACGACCG7T ACGCAAAT77 GAGGAGCTAC 7CACC7AAAA 
G7AGCTCC77 TGACAAA7G7 CCTGG77T7A TACCAA7TGT 7CACAA7GAC A7A773T3C . 
GGCGGAAACA GGTG7CCCGA T3TATCCT" GCAAGTAAG2 ACCATTACCA T3TG=CATCA 
ACGCACGCAG CA7AAGACCC GA3CCAGTCG 15C2C 

CGCCCTCCA7 CGCGCCTGCG AA7777CCCA CCACCCAA7A 77G7GGCA3A TC777C77A7 190b v 

G7ATA7G7GG 7TACAAACAC CACGCCCC7T AAGCTG7CC7 CTCTCCCAAG GGGAC.^.. _9.-,~ 
TATAACAGTG A 3 A7A CG AAA C3GAGACGC7 C73AAA7GCT T7C7A7777A 777A7C3A7. 

CCG337TAAC A7AA7CACAG GTAGC7ATAA AATCCCCAT3 C7C77GAC37 G37AACCC7G 1326C 

GrCTGAGGTT 7CCTCTGTTA TCAAACAAAC CTGACCACAA CTGTACAGAG AAAAG7GGGT 19 2.. 

3AAATG7AGT GTTTATTTTA TCC73ACACT 773AC7TAAC CACAGCCCG7 CAAACCACAG 1925C 

GGAC3CTGTT GGCTGACTA7 TAGTCATCAC ATG7AAC7GA ACGCAA7C7G AGC7.3A7GA 154<: 
CGAGGGGGAC CAT AT CG AAC TGTTCTGCCG ACGTTGGGTC ACCTCCGATG AACACAGTTG 
TTTTTTTAAT GTGCTCATGT CCCT3TATGC GATATTGTGC CACATTAAAA ACA7CCAGAA 
CAGCCCTAGA TGACAGTCCG CAGATCACAC CAAACTTCTT TGGAG GATTA TTT C CAT 3 AT 
ATAATACGGT AGACTTGCAC AAATTCTTAA CATAAATGCC AG AT CG G AG A GAAACTATCA 
CAAGACCCGA AG C AAA CG AG CGCAGCACGG CCGCCAGCAG GTTAACGTCT CCTGGCCCT3 
TGTTATTGTC GTCAGGTTTG GGCAACAAAA CTCTTAACCC TTTGCGCGAA. TGCAAGCAAG 
AGTGGCTAAT GTCTGCCAGT GGGTTCTGGG AACATAGAAT AAACACCTTT CGTTCCACTT 196SO 
CCAAAGACAT TGCAGGGCGG CCAAAATAAA ACACTTCCAC ACCAAGCCTA TCGGTTATCA 1 9 52: 
TTACTGGCGG CCGTGCCACT CTATAATATG CGGATCTAAG CTTCCTGTGG CGAATGCGC2 1?9S0 
TCGTGGTAGG CCTCTCGTGT CTCCGTGGCC CATCATCCCA TAAAAATTCG CCAACAACTG 
GCCGGCGTCT GGACGCCGGC GGCAGTCCAG CACCATCATC GACTT CTTCG TCACTTATCT 
CCAACACATA TTCCCCTGCT ACATTCTGCG CCTCGAGTGC CCCAGCTAAG TACACATCC7 

« TA r A , CCCGACAGCC GAGGCGGCGA TTG AG CCZTC ' TG TT AC C A CG CCGCTTGCAT 

CCGT^TCGCZ TCCGGGCTGT GATGTTGCGA TAACATCCTC TGGGATGCCA AGCAGATCAA 
AGAGGTCTTC ATCGCACATC GCCCTCATTA GCATGTCCAT CTCCTGTCCC ACGTGGTACA 
TCAATGCACA TGCAGATTCT TTATCAAGCA GTGTGAGGTC ATCTTCAACG TTGTCTGTG7 2 04 00 
GCAC3GTT3T TTCATCGGCC GGGGGGGGCT GCGAGTCGCT ATGACGCGTC GAGGGT-CCTT 2 0450 
CGTCTCCAGA GCCAGGAGAG TCGGCATTGG CATCATCAAC TGGCTGAACC CCAGACG3A2 2 0520 
TATGGCGCGT CGATGGTCCC T3G7C7C3AG AGTCCTCAGA TTCCGCG33C GTCTGCGTGA 2 0SB: 
33G32A3AT2 3 C AAAAG G CT G G G TG AT C CT CCTCACTGGA ATCCGAGTTT 73A33CAO~-. 
ATGGCCTA2A GAAAAAAAAA CAAATATGTC AACCGGACTA GG3TGGCCAA ACCA7TTGC3 
C3A3233733 C2A3T3TT73 333A3GGGA3 A3A737TA33 TTGGTCTTCT CCGATGCTTC 2C7SC 
TC3AGCC3TA -CACTGTGTTG ATA3AAAATT TCCCATAGTG ATGA332A3T GTGTAG3TGA 
GT3CTGGCAT GAACG3ACCA 33AGCATTC7 TTTACCTCGG 3ACACA3GAG GG3C3ACCT7 
CTA _ rTAA TTCCCTGTAC GAC3T3GTA3 T3TT3A33T3 GCAAGCGT27 AAGGGGG3G 2 2 0 940 
GACGTGGTAC ATATTTTCCC AAAAGCCGTA A7GGGCGAG3 CCAG7AAA7C 7CTGGGATG3 2100C 
AGGCC CTTCG ATAG3CATTC 23T3TTAAAA TCAATGAAAA ACTGTAGGCT AT33AGAGGA 210SO 
ATTACGTCAT TACGGGCAGC C 3 GAG 3 AAG A AAT3TTCCA3 TAG AT C TAT C TAGCCACTTG 2112 C 
AC2AAAG3AT ATTTATCAGA G733AAAG2A CCTACAATAA A 37 2 AG AAA. 7 CCAG37AAGC =118 C 
2733373223 C3ATGTTGA3 37373AGAA7 G3777G2373 C3AG3A77A3 23CA2C7CAA 
CA3AAGTAA7 37A37A333A AA33ACAACA 7GC7TCCT32 AG CTTTAAC 3 773AGTCAC3 
GGT3AAAAA3 CATTGCCTGT A77AGA2A3A TGTG7773T7 A37A7GAA73 G73373733A 
G33373G3AA 3AACATCTG3 G373A7GC7G C3373GACGA 3 3777 3 AAA 2 A3G37ATT33 
AT3CATAATG AAG223A2A7 GT773TC7TA C7TTACTAA7 
SGGACACCCC CTTGC3TTGG 3AGC7GAGTG AATCCGAA3C GC3TAGGAAA AAAATAA"A 
CTCAGA3TTT ATTTT3CAGC CAGACGGTGG CGCTAAC3TT TAATGA7G" CCA3. CJW3T3 
A3"T3GC=A C?CC=A«5C= CA3ATGGGCC TACTATAACA GGAAACATAG AA3~33G3A =1»: 
TAGAGCCTGG TTT3TAA3GG CAATGATATT TATA3TGCAA AAG3GAG3GC 3GTAAGA2AA =1": 
AGGGAGGTAC 33G3ACASAG T3ACAAGAAG ACTTGT3AAA ATTTTAGT" CTGTG3TAAA ll" 6." 
ATGGGGCAAG GTAAATGTGC AAAATGACTG GATAGTGATC CGAGTCATAT .TCA3333ACG : iS4 C 
G3CGGCGGC3 CAGAAACAGG GACGCGTACC GGGACCC7TC AGGTTCTCGA TTAT3TCG3T =190C 
C-ACGTCAAA A33TT3TTGG ATCTCGTGGC G3TGG3ACAG GG3CCTACA7 TTG TCTATT Z ZZ9ZC 
TTCTT3GCGA TGCATTTCCA ACAAAGTATG CTGGGTATTC GAATAATCCC TT3A3AAAAA ZZOZ' 
TGCCCATGTT TGTACCGATG GCCACAACTC C CAT G G AAAA CCTGTCCAGC GTCTSTTCCA =20SC- 
AAGTTCGGTT TGCGTCCACA CTACAGTGGG CCGTT3TGGG AAGTAAGCAT 7TATACGGGG 
GTACC3TCTG ACATATGTGT TCAGGGGAGG CCTCTGGGAC TTGGGAGCAA ATAA3GATGT 
C3CG3GTTAA ATCAAAGT3G GTCTTCACGT TTTCTCC3AA A7AATACA3T 7CCA"ACTA 
33G3CACAAG CTTGT=A==C A3TTTGTAAA TAGCCTGTT7 CTTACTCAGG TA73"GC3A 
GGGA7TGGG7 G3CGGTTAAG A"7TGGGCC 7CA7GTC3GT TG3ATACCAG TAAAA73TC7 
.^^..^-—-r -TrTTGGTCC TCSACGTCCC GGTCATCACG ACACAACGG7 GGAATA3AA7 
CAATAAAATr AT==A=ATTG TCGGAAG CTT GGAAAGA7GA ACCCATGACA GA33 Z 3CCAG 
GTG=caw . =T CTCAAGGGGA T3C3TGGC3G GAA37ACTGA GACA=T=T=C 3T33ACC33T 

,_„,.„.. GACT3C AT;G33CCCT GAGGGCT-G3 AGT7TCACA3 AGAA3TTCAC 

TCAGGTCC" TAA3TCA3GA AGCTCCTGGC 3TGAACCCA7 3ACA3AG3" CCM3TGCCG 
AACTCTCAA3 G33ATGC3TG GCGGGAAGTA CTGAGACA" CTC33TGGAC CC" "TCAr 

_ — z.~~-- -~8CC 

ZTZZ „ ZCZ ~ CT3 « TCGGG CCCTGAGGGC TCGCA3TTT. A — . " 

C3CCTAAG7C AGGAAGC7C2 TGGCCAACAT CTGACAAGAG ATCTAACAAA CAC222TCAA 2286 C 

7GTGA7CCAC CATCGGTAGG CAATCATCCA GGGCACTGAC A7GAC7GGGG ACGGGGCCT7 

CTGGGGAAAA TGGGGTTTGC GAC7G7CCAG CAGGCGGGG7 7AA7AAGCC7 7G7373TCA7 

GTGGAAAAAT AACAGGAGAA GG7AAACCC3 CCG7TGGCAA ACA7AGATCC G7C3GGG7G7 

GCACGTGTAA TGGGCCC7GC ACCTGGC7CG 7GGAGGGACG CGGGGAA7CC GGA377AA7A 

AGCTCGATGA CTGACCAGAT GACCCAAACC 2CGACGG772 7GGC7C772A AAAAA3AAA3 

7GTGCATATC CCTCCCTACA AAACCCTGAG CCCCCACC2A AAG77CG777 7232737CAC 

T-GA-TC-GT A7C7TC3C7C 7GTGACCG7G A7GAAACT7C AG C7G 2 GG AG GA727737GG 

GC37GGCGAC 7GCCGCCGC2 7G777CC7G3 CGGCC7CCC7 AA~-~— 

AAGG7AAG72 7GAG7GACA7 CT7CAA77TC CCGTGATGC3 CG2TG2A3GT ACA7CCCGC2 2 340, 
GCC3ACACAA CC2ACCGGCC AGTACATCAA C CAT C CT AC 2 T2T3GGCTTT TT...7AA3G 234cC 
7TTAGCATG7 TTTTCATGGC TGGT7CCTGC GTCAAGTACA CAAGACATCC T7CACA7CC2 23 rr- 

T7G7ATGGCC TAGGTGTCA7 AA7CCAGCGG TTGAG7TTCA TTTTTCCCTT ATAGATGG7A 23 64 : 

:C"CT CC7G7C7GGC 7CGATTGGCG. GTCCT7AA7A GCCG7CC AAA GCAGCCCAGC- Z2~:: 

7C7CCGGGA7 T7C7GGCAGC CCG7GCCTAC G7CGC7CCTC CAAAAA7GCZ 2 3"-:: 

TCA7AGAAG7 CA7CGAAGCC 77C7GGCA77 CTCTCCCGCC GGT77CGACC CGG2ACGG7G 

AA7A7TCTC7 T77G77CA7C CAACCACCC7 ACCCCCCAGA AGCG7CCAC7 G7C7AAAGCA 

7CTA7AA7AA AGTCCG7GAG CCA77CCGAC TCCGTGTAGC GAGGCATCT7 77TAGGCAAA 2 3 94C 

AGCCACGACA CAAAACACCT TTTCCG7GGG CGACTTTCTC GCCACAACTA GC7GGACCC2 2 4 CCC 

AACCCCAC7G GCACGTAGAC TCTG7GCCA7 CTAACAACAA AACTCAATA7 ATGCAGC7CA 

ACACCGCCCC CCGCAGCCGG TTGTCGGGCT GCGGAAAC7T G7GGTTAGAA C7CAC7ACGG 

AAAAGGGAA2 CAATGCAG77 GAACTACTGG CACACACCCA 7AACCCGGGA CAGCACCCAG 

GCAC7GTCZA C Z CTCTAATA CAAGCGGCCT T7GGACGCGA GGGAGGGGTG 7CATGGTCAA 2424 : 

CAAACCAAGA AAAACACAT3 TA77ATTCAA 77AGCCAACA ACT77A7TTA 77ACC3ACAZ- 243 CC 

GAGACATGAG A7ACA7AAA7 77CCAACCG7 GCA7AGGG CC AA7 A C CAT C T G7GGAGCG77 243 6 0 

AAG7GCCC7G 7GGAG77T7C GCC7AATTAG C7GAATC7CG ACCCCCA7TG CGGCCAGCAT 2 4420 

GCTCACGAGG AA7AGGCAGC AGAGGCAGGA CCTAACTAGG AG CAT AT C CG GACC7GA7CZ 

AAG7A7G732 ACCAAGG7GA GCAACACTGC CGCCAAAGGC AGGAGAACAA ATAGCGC7CG 

TCGGGAGG2G ACGGA7ACGC CCACGCA7GA CAGTAACCCA ACATAAAATA GCG7CATATA 

C77A7CCA3G CCAA7CAGGA CCGGAG7CAG CAGGCCGA7C GAGGCCG7CG A7ATCAGGG7 

GGCCAGCAG7 AAGGTCACAA ACACGACAAC CTCGCGCC7A CAG7AGGC CC AGGCCTGGAA 

CAC7GAA7AG G7GATG7ACT 7CCCGGGCA7 GA7GAATA7G GCCC7CC7CC 777GCA77CZ 24 7SC 

GGCCC7GAT2- 7ACACA7GC7 GTTCCAGG7G CCTAAA7GCZ AAAAG7CCCZ CGACCAAGAA 2 4B4C 

GACAATGAAC- GG2AGCCAGA AAACGCCGGA CACAAAGACC 77 C77AAA C A ACAGAAGGTA 24S0C 

GTACACCATA AA7GCTCCGC AGAAGCCCAG C72A7AG7A2 27G7G7AC7A 77GG CGG CG Z 2 4 96C 

CTGA7AC A Z Z GCCG77GCGG TGGC7AGCGG A7AAGG7AA2 AG C AG 7 AAA C AGT7AAG7A Z 25020 

GCACAGAC22 GG7A7GAAGG GCACACGAGA AAATG T AAA C CCAGAAAAGG CCGCGCAAAZ 250SC 

7ACAGCAGCA AACACTGCTG ACGCGCAGA7 C C A77 C GAG Z CTCCGG7CCA GC7G777T7G 25 24 0 

CGCCGCAGGG CACAGACACA 7GCA7ATCAG GGCCAAGTG2 GTGACTGGCA GCGACCAGAA 2S20Z 

AAACACGGCC G7GATC7CTG TGG7AAAGAG 7G7GAACGAG 7ACAGGGCC7 7GAAGATAAA 2=26 0 

ACACCACAGA AAGGGGG7CG CCGCCAACG7 CCCGC7CAGA 7AAC7GAAGA G 2GACAGAG Z 2 5 22C 

GCGCTCAC7G 7CCAGGCGGC ACA7GG7GTC AAATCAGGGG GT7AAA7GTG G7777GGGCA 2 5 3S: 

CC7TCCCACG A7CCCTGGAC 7GGC7CGAG7 C7GAGCGCC7 C77G7GAGG2 27C777G7G2 2 54 4 C 

TG7CC77AG7 7GGCGCCGCT GGGGGG2AGC TGGTGACAGA GGCAGCG7CC 7CAGAGGCG7 

CC7CCAG2GG CCCAAAGGGA CCAAC7GGTG 7GAGAGGGGG AGAATCCGGA GAC72CAA77 
C2GGCTGCCT CCTGGAGT-C GGTA7AGAAT CGGGAACGT7 TTGCGAAGAC TCGC=TCC=: 

CGGCAGACAC AGATCGGTTT AC 27 27AAAA G7AGGACACT TAA 2777 ACG 7CA2C7GA77 2 56 = 

GG2AGCCAG7 GGGCACACC7 TCCACTTGTA A7AT7TCG77 GGAG7G22AA A72AG333G: 

GGGTAAAG2A ACCCGGGAGT 77 A C ACAG 7 C TGAGGG2GGC GA77AAGGA2 722A3G37A^ 

CCCGGGTGAG GGCGTCGGTG TG2ACCACGC CCACAT2CA2 CG AGTT777C C23772AGAC ZSScC 

7GCCAG22 AGAAACGGG7 TTGGTTTCTG GCTTGAAATC AATGATCTTG CT2A7G23A3 2 5<=:C 
CAAGAGAAAA TGTCACGATC GACAG CGTCT CG2TGACAGA CACAGTCACC GT77GG7CC7 
r ,_ rTT3 - TTT TTGCTGCCTT AGCCAC7TAA G7AGGAA7G2 ACC2GT777G CCA2AGAGGA 
GAAGCC7GG7 GG7G37ACCA CCGGC7TCCA 7CCGA7CG7G GAAAGG7AGG A7A3C27777 
GGTCCACCAC GCTTTTGTGC ACGGTGGAGG TGAGGT7GTC CCCG7AGGAA A7GG7GG72C 
TGACGAAC7G CGGT7GGGCC CCCG7A7CGC ATGCCTCCCC C77T Z G A7AA AAGG27ATG2 
CAGCGTCGAG TACATTCGCA CCGAATAGCT CACGCGTGTG CG7GAAGCCG CTACGGAGGG 
ACG7A77C27 GAAG2TGAAG C7AACGTCT2 CACTGCCTTC CG7G7G7G2C A32AGG3GCG 
7AAGGG2ATT C777ATTCT7. AAC Z C 2 AG AA CGCCAGCTG7 CCCCACGC7G GACAGGA2A2 
TGAGG3773G CGTGCAAGCC GA7G2GTGCA C7TGCAC7A2 7CCGG7777A G7G2-2A2TC7 2646C 
"AA7G73773 A77GA3C273 C7GA7777AG ACAGGAGGG7 CA3G77CAGC G7GA23C2A7 2652G 
AG7GAAAA72 CACAGGGATG A77GCGGCCG 7AGACGCACA GAGAAAT2AC AGGAAAGC7G 
CZZGZ . nC ^ z GGGTGA72T3 GAGA 3G AT AG AG7G2C77AA ATAGAAC777 TAGGGGAGG7 
G3AAG7G732 GACA73GA2A GG77AACG77 CACAAA72G7 2AG72A2ACA C37GG7G7AA 
73A3AA77G7 2T3GG7CAAA AAAATTCACA G Z C77 G AAAC 7G33GG7G7A 7GAGAGGGGG 
CACGC772TG GGG3AGG3G7 GCCAAA7ATG GGAGGAACGA AAA7A72A2G 2AGAA72C7G 
73AGCGG7GG C7772AGGAA C7772GGA7G 7C2AC2A7G7 7AA2AAG2G7 7AC222GGC2 2658 C 
G22T7GG37T GGA7AAAC2G AA727CAA7A 772A27G227 2G37GAA2AG CGG37GGAC3 26 94 0 
7CTG2GTGAC TGGG777TTC CTGTATC7C3 ACGA7AGTGT TG7AGAA2A7 AC7GG2GGCC 
77GGTGTG 2 A GCAG37CG72 237G3AAA7G TAATCG7TGG 2AAGG2A2A2 C23G332A73 
A7GCC72G3A CCCTGCACAA A77GA7AGAG TAGAAGGAGC 7AATAAAG7A TA72222TC2 
AGAA73AAAA A2ATCAGAAT 37727GAG27 77GG7GGT2G C77TA2GCA2 CC7GGAGTGA 
AGCCAC7G2A GC77CTCG3A AAGG33GGGG TC3AAAA7GA 7G77GG2AG2 A7A7G27AGA 2^24 C 
AG777G727C GAC7GTTG77 G AAAAA7 A7 3 772AAGA7A7 7G33A7A2A3 GA2A2G37GG 273 OC 
ATATT2TCCA TGGCAACCTG 7T2GG2A7AA TAG7G3GG2A 237GG7G3G7 G77AAAA777 2~36C- 
G7GACAA3G7 C37CAA7G77 AAAG77AA2T AGG23772GG CCA77232AA AAA 23 7 AAA 2 
AAAAA7C7A7 AAAAGT C Z 77 G72G33A72G 2TGA3C7GG7 GCAG37G3GA AA2A72AAGG 274 EC 
73CAGGGGTA 737GGC7AGG AAACCA72GG T727GG3AAG 7372333337 7AG3333AAA 
AATC3GTCG7 GATCGC7737 A7A3AGAAA7 2GA72AAC7G AA7C3A7TGG 3372A222GG 
CT73CAGAGA CCTACCTAC7 GACAGACCAG GCAC7CGGGG 7C7G7CGCGC AGGAC7CC7: Z'^: 

CTCCGGGT77 TTAGGTCCGG GTAACCACGC CCCA7C7TG7 TTZATCCCAG AG7GAGGCGG Z"2 2 

7GACCC7GGA TCTGCCAGGC ACTGAAGAGC CG7CAGAC7A GAT7GC77C7 GAACCC7ACA ZT7E: 

G7AG7ACATG AGGGTTTTTA GAC CAAGCCT G7A7CCA7G7 AG C AG C AG G T CCC7AAGA7A 2 "54 1 

GCTCGCATTC CTGACTCTGT CCTCC7TGAG GAAGAAGCTC ATGGAC7GGC TCTGG7C7A2 2"S:: 

AAACGGCGCC C7GGCACGAG CCC7G7CCAG TAG CTT AAAT GGACAG7AAT CAAAGGC7G7 "7960 

TAGGAATACC C7ATATC777 CCCTG7GATG CT7GGGGAAC G7GGAAACG7 CCCCACCATA 2SC2C 

CTG7CTAACC ACCCGAAGG7 CGTCGGGGAG AACC77777A AAAAAAGTCA CA77GGGCC7 13 08 : 

CAACACC7C7 7C777A77GG 7GACC77GGA AGATATA77A GCAAAAAAGG GG7ACACAGA 2 514 0 

C7CGGCA7AG CCAG77AC77 GCGAGGTCCC AGCCGTCGGC ATCACCGCCA GAAAC7GAGA 2S2 0C 

A7TGAA7A7G CCATGCTCGG CAATGCTCT7 TCCCAACGCG TCCCAGCGA7 GGCG7GGTAC 2E250 

AAACGAAGCA TCC7CCCCC7 CCCATGTTTG CCAA7GAAAC CTGCCC7TGG C G AAG 77 A C 7 2532 0 

GACCTCCCAG CCATGAAATG GGACACCCTG TCCC7CCAAA ACAAGG7TG7 GAC7AGTC7C 253S0 

CACCGCGG7G 7AG7ACATAG AC7GGAATAT A77C7TGTC7 AACTCAGCGC TCTCAGCATC 2 34 4 0 

GAGGTACCCG 7ACCCCAA77 CCGCAAACAC ATCCGCCAAC CC27GAACAC CAA7CCCCA7 2 6 500 

AGAC C7CTCC 77TTGACCTC GCTCGACCCC CGGTG7TGGA TGGGAACCAC CCAGAATG2A 2 6 560 

GGCGTTGA7G ACGAGGACTG CCACCC77AC 7GCG7CGCCC AAGGCC7CAA AACAAAAAAA 2 8 6 2 0 

CGGCC7G77G GCG7CCGTGG TGCCAACCCT CGCGCT77CA ACAG77C7CA GACAC77TGG 286B0 

AAGGCAGA7A 777GCCAGG7 7GCACACCGA AG7G777C77 CCTG3CAG77 GGAC7A7CTC 23740 

7G 7 AC AC AAG 77TGAGCAG7 7 AATGG CC AT GCCCTGAG7G 7C3GTCCAGT GG7G7TCA77 288C0 

GAGCG C77CT 777 AAAAG C A CGTACGGTGA GCCTG7C77T ATGA73G7G7 G3A7AAGAG7 28B60 

GAACA7CA7A GACTTCAACG GCA7GCAAC7 AACG7AC777 CCAGCCCGCA CCAG3C3C7C 2 6 920 

G7A77CG77A 7CGAACGCAG CAC23TATA3 C7TAATCAAA TTGGGGGCGG 73GCTGGATC 2 9 56: 

GAACAAA7AC CATAACTTGC- A7GGG7CCTT 77 CA7ACA7C CTGAAAAACA A7G77GGGA7 2 904 0 

GCACACGC2C 7GAAAGAGAC TGTGACATC7 GTCGGGA77 Z 7CCGG7AG77 7GGCG77CAA 2 91CC 

AAAA7CACAG ATT7GACTGT GCCAGAG77C CATGTATGC3 CTCGCGCCAA CGGGCC7GA7 2 9160 

G77A77G7CA TTGAAA7AA7 GAACCTGGGC AT C C AC C AG 7 77GAGGCAAC 733C7A7G77 2 5220 

C7777GG7GG GAGAA7GACG T AA C AT CC AG ACCCACGCCT GACTTACTGG CCAGCAACG3 2 52 BO 

AC7CATATCG 7GG7ACAGGG CG7CCAAAG7 ACCCGACTCA TTCA7CATGG AG3GZTGCAC- 2 3340 

AA7AAAACAG CTGGCGAG77 37CC3CC77C GAC7CCAGC7 GAGCGCAGTA TTGG CGTGG C 2 94 02 

GCAGCACACG TGC7GCGCAG GGAGGTAGCC AAAAACGTAC TCCA C7A7A3 0CAT2TCAGA 2 9460 

TACAGACTTA GCGTCCTCAA 7AAGGTCCC3 CGCCAACCAA T^CAGGCATT 2ATGCTCTAA 2 552 0 

GCACTGACAG GCAACAAACA CGGAAACCCT CATAAACA77 TGCGCCACGC 777CA7AGAC 29580 

AGG27CT37C CCCATGGTCC TTAG3ACGTA AGTA7CATA2 AA2CTCACGG CC3A7AGG7A 2 564 2 
GCCACAG7TA AGTGTGTCCT CGTAAG CTTT GGACCGTCTG TAGGCGCACA ACA7ATGTT 
CAAGGCA7CA A7GTTGT7TT GAATAAACGA TTCCACCCGA 7G7CCCAACA CGGCTC3AAA = "6: 
AAT Z CC AAGA TACTG CTTGA GAGTCGC7GG GCACCTAGCC 7CGATAATT7 GG7GCGA2AG 2 9E:: 
CCGCCGCGGC ATGGCATTGG CCCGCACGTG CCACCCGACC CTAACCTTTA GAAAGTGTA7 2 
GAGAGATTGG G C A 3 A™ AT AT CAAAATGGGA CAATTGT CC 2 GCAGACACCT GAGACGGGC3 2r*4; 
TGGCTCTGGT GGGACAGCTC CCAAGTGAAC CTGACAAAAT GTCCGGACAG ACATGAGG7T 3 0C0 2 
ACAGAAACAC AGTCGAGGGG CCACACGCGG CCTCAAAGTT CGCAAACACC AGTACAGG2A 3 006 2 
AGGAGGTGCC CTTCAGGTTC AGAZTTTGGT GCACCGGATG AGAAT CAAAG GGAACTGTG2 2 212: 

CCAGCGTACA AACCGCCCCA AAAACAAGCC GATTTATATA CAGCTCGTGC CTCAG GTGAA 2 2 2 S: 

TATACTTGGT CCGGATTACA TCCGTAAAGT GATCCTTTAT CATGGCCAGA ACGTCGGCAA 2 024G 

AGCCC77CCC AGACTGGAAA AACGTCAGGG CCATAGATGG TCTC7GGTTC ACACGGAGAT 3 03CZ 

AAACCAACGA GGCATAAATA GTAACGTTTA GGCCTGCCGG TTCCCGGCGC TGGA22ATGG 3 03 6 2 

GACATGACTC ATCCAAATCA ACT AG GAT AT GACAAGGGAG GGTCAAGCCT ACGTGTGCA2 3C42C 

GGGGCTCGTC CCGGGCCAA2 CZP^ZTZCCT TCATGGCGGA GGTGACCTTG G T C A CG AAG G 3 04BC 

TACTGTGGAC AGTC7GGACC ATTGGACCTA CTGGGGTAAG GAGGGTATGA AACTGGGCAG 2 054 2 

TGTGGATGAG TTCACTCAAG TTAGGGATGA AATCCGCCAG GCGGGATGCA CTTGCGTAG7 2060C 

A2A3A23G37 CACTTTGTGA GTCTGTGGCG CTTTTGCCGC TTC2ATTCCA GAGAG GATAA 30660 

ACAGGGAC 3T GGGTGTTAGC AGCATATGGA TAGACGAGCG GTTGTCCTC7 TGGTTGAATG 2 0-2 2 
AAAATAAAAA GGTTCCGAGA GGGTCCTGGG GAGTAAAGGT CTG7GAATA7 A2GAG 
CTCCATAGGT CGGGTGGTTA AACGGCGCCT GCCGCAAGGC 777A7G7AG2 GAG 2 
TGGGTCGTGT GGAZGGCG2A TATTTAGAGA GTAAATCCGG CAGCGCCGTG GCAAA7TG2G 
GTCGTCTAGT GAGGGATACC CGGTGAGTTG GTGGAGGTAA AAGA 2 32 AA 2 A777G327A2 
CCAGGCGAGC CGCATTTT2A GCCTG2A237 TCATATCCAC GCGGGCAATG GACG32AGAG 

AGGCTGTTGA AAAG2TTACC AAAGGCGTGA GTGGGGGAGG CGGGAGCGTT 2A22AGA2AA. 3103C 

AGCTGTTGAT G G AATTT ~ AA 2T27GAG3A2 TGGCGGTGCC TGG7GTGTTA AA GAG GAG 2 A. 3114 2 

CAACAGAGGA GTTTTTAAAT AGTGTTGGGG AACTGCCGAG GGACGTATGA AAAT77A7A2 2I20C 

GCGAGTATGG CGTGTTCG2A CTGGTTCGGG CGGCGTATTT TTTAGAAGG2 2 777 27 AG 2 A 312 6 2 

T2GACCCCCT TGAGGCAG2G CGCGGTCTTG GACGCCTGGT TG AT AT ATT A 72A77A2AA7 2122C 

7A22G2AGAA GAC2GGAG2G GGGCAGCCAC CCAC7TCCGA CGACACCGTG AA7AA2TGTA 3 1330 

CATTGCT7AA AGTAGTAGC7 CAGTACGCGG AT C AG AT AG C AG G TTT G AAA A2C22 22777 3144 C 

72 27T22237 G23A777G2A ATGATCGGCC TGTTCACATG 2GTGGAAGAG ATGTA2 2A2G 21 = 01 

7ATGT7772A. GAAATAGTGG GGAGCTGCAC TA72322AA7 GTGGATACTG A2ATA23A2G 2 156 2 
GTCCGAGTTC TCCGTTAGAG GACTGGCTTA TAGTCGGGTA TGGTAAGAAG GAAGGAGTG7 
TAGTGGGGTG TGG2A7ACGC TCGGAGGAGG TGTTAGGCAA 
AAGAACGCGC CCTCGCCATC TACCGGGTGT TCGCCAAGGG TGAGGTGGTG GC3GAAAATA 
CTCCCATTGT TGCCTTCACC GACGTGGAAC TATCCACACT CAAACCCCAC TATCTG7TCA 
TCTATGATTT TATCATAGAG GCATTATGCA AG AG CTACAC ATACTCATGC ACCCA3GCC2 
GCCTGGAATC CTTTTTG AG C C G AG GT AT AG ACTTCATGAC TGACCTAGGT CAGTACCTAG 2 19SC 
ATACCGCTAC TAGCGGCAAG CAGCAG CTGA CGCACAGCCA AATAAAGGAA ATCAAATACA 2 2 04C 
GGCTGCTAAG CTGCGGTCTC TCGGCTTCCG CGTGTGATGT TTTCAGAACT GTGAT CATG A 32 IOC 
CCTCCCATA TCGACCGACC CCCAACCTCG CTAACCTGTC CACGT7TATG GGGATGGTT2 
ACCAACTGAC CATGTTCGGA CACTATTTCT ACCGGTGCCT GGGCAG CTAC AGTCCCACCG 
GCTTGGCCTT CACAGAATTG CAAAAGATAC TGACACGCGC CAGCGCGGAG CAAACG3AAC 
GTAACCCGTG GAGACATCCG GGTATCTCGG AC ATT C C A GT GCGTTGGAAA ATATCGCGTG 3234 0 
C-C-AGCAT7 CTTCGTCCCT CCGGCCCCCA TAAACACTTT GCAGCGCGTG TACGCCGCGC 324 0 0 
TGCCCTCGCA ACTCATGCGG GCCATCTTCG AGATCTCGGT CAAGACCACA TGGG3AGGCG 
CCGTACCGGC AAACCTGGCG CGCGACATTG ACACAGGACC GAACACACAA CATATCTCCT 
CCACACCACC GCCCACCCTC AAGGATGTTG AGACATACTG TCAAGGTCTG CGGGTGGGAG 
AGAGGGAGTA CGATGAGGAC ATTGTGAGAA GCCCGCTCTT TGCAGACGCG TTTACCAAGA 
GTCACTTGT7 GCCTATACTG CGCGAGGTTC TGGAAAACCG CCTGCAGAAA AA C AG AG CT C 
TGTTTCAGAT AAGATGGCTG ATAATATTTG CTGCCGAGGC GGCAACGGGG CTCATCCCTG 
CCAGGCGC3C GCTAGCCAGA GCCTACTTCC ACATGATGGA CATTCTGGAG GAGA 3 AC ATT 
CCCAAGACG3 C 2 TAT AC AAC CTTTTGGACT GTATCCAGGA GCTCTTCACC CACATCAGGC 3233C 
AGGCTGTTC2 AG A CG CA C AG TGTCCGCACG CCTTTCTACA GTCCCTGTTC GTCTTTCAAT 22 94 0 



2 216 0 
2 222 0 
22280 
32520 
3264C 
TCCG3CCTTT CGTACTCAAA CACCAGCAGG GTGTAACCTT GTTTCTAGAT GGCTTGCAGA 33000 

z ^ czcrczz CCCGGTGATA AGTCTGGCCA ACCTTGGAGA CAAGCTGTGT CGTCTCGAGT 3306 0 

TCGAGTACGA CAGCGAGGGC GACTTCGTGC GCGTGCCAGT TGCACCGCCA GAACAACCAC 3 312 0 

CGCACGTACA TCTGTCGCAT TTCAAGAAGA CAATACAGAC CATCGAACAG GCCACCAGGG 332BC 

AGGCCACCGT AGCCATGACA ACAA7CGCAA AGCCAATATA CCCCGCCTAC ATCC3GTTAC 3 324 0 

TGCAGCGGC7 AGAATATCTT AAC AG ACT C A ACCACCACAT T CT C AG G ATT CCCTTCCCAC 3 3300 

AGGACGCCCT TTCTGAACTC CAGGAAAC CT ACCTGGCGGC GTTTGCACGG TTGA3AAAAT 3 3 360 

TGGCAGCGGA CGCAGCAAAC ACTTGTAG CT ACTCCCTCA3 CAAGTACTTT GGAGTTTTAT 2 3420 

TCCAACACCA GCTGGTCCCC ACGGCCATCG TTAAAAAACT GCTACATTTC GACGAGGCTA 3 348C 

AAGATACCA3 AGAAGCCTTT TTACAGAGCC TGGCA2AAC2 CGTAGTGCAG GGACAACGGC 3 3 540 

AGGGG3CG32 TGGCGGGTCG GGTGTCCTGA CGCAGAAAGA ACTTGAGCTC TTGAACAAAA 3 36 0C 

TAAACCCACA GTTTACAGAC GCTCAGGCTA ACATTCCTCC AT CT ATT AAA C3TTCATATT 3 3 650 

CAAATAAATA TGACGTCCCT GAGGTCTCAG TCGACTGGGA AACGTACTCC C33TCTGCCT 2 2 720 
TZGAGGCAGG GGACGACGAA CTCCGTTTTG TCGGAGTGAC GG7GGCAGGZ CTGC33AAAZ 
TGTTTGTGGA AT AG AG G C CA TGGCAGCCCA GGGTGTGTAG A7GGAGGGAA T3GGGTCZAZ 

CCACCAAGCT AAGTGTATAT TCGGAGAACA TGGTGGATGC GAGTGGGTGA GGAAGTGGG7 3 3?: 

GATGTAGGTG GCGTCCAGCT ATT AT AA C AG CGAAACCCCC GTZGTZGAZA GAGZZAG ZZ7 2 = ? = 

GGACGATGTA CTTGAACAGG GCATGAGGCT GGACGTCCTG ZTAZGAAAAT GTGGGATGZ7 34 2 2 

GGGATTTAGA CAATATGCCC AACTTCATCA CATCCCCGGA TTCTTCCGCA CAGAGGAGTG 

GGGGAGGAAG ATGTTCGAGT GTGGAGAGTT TTATGGGCTC ATGGGAGAGG ACGCGGCCAT 

CCGCGAGCCA TTCATCGAGT CCTTGAGGTC GGTTTTGAGT C G AAA GT A GG CGGGGAGGGT 

AGAGTAGGTG AT C ATT AT CT GGGAGTCGAA AG C CG GAG C A ATZGTCGTGA ^ 
GTATTACATG TTTGAGG07C AGTGGATACC AAACATCCCC AAGAGTGGTG 

AAAGACTAAC GACGTTGGCG TTTTATTACC GTACATAGCC AGAGATGAGA CTGAATACA Z 

CGGGTGCTTC GTTTAGTTTA TCCCACATGA CTACATCAGC ZGAGAGZAG7 AZATGG7AAA 

CGACTACCGC ACCATTGTGT TGGAAGAACT CCACGGGCCC AGAATGGATA TGTGZGGGGG 34 5 0C 

GGTGGAATCA TGGTGGATGA CCGAAATCAC GTGZGGTTGT GTATCZGGGG GGGGTAGTGA 34 56 0 

GGGAZOATTG GGZAGGGAGT OCAZZZAATG ACAAGACGAA AGGGGZGZGZ G GAGA C 2T ZG 346ZC 

CGTGGTGATT CCTCGTTACG ATGCGACAGA CCGCGGACGA CGGZ7TGAGZ AAGAZZGGZZ 34 6 6 C 

GGCAGAGCAG GZAGCGGGAT AGGGTGGAAA CAAAGGAZGZ GGGGGTAACA AAGGAZGG3G 3474 Z 

CGGAAAGACG GGACGTGGCG GAAATGAAGG AGGGGGTGGZ ZAZZAGZZAZ ZA3A7GAGZA 34 90C 

ZGAGZGZGZA CACATCAGCG CGGAAGACAT GGAGGAG7G7 G A Z 3 G A G AA G GZ3ZZGATGZ- 34 66C 

AG A GAT G GAT AGTAGACZZ3 GAAATGGTGA GAGATCZGTT AZ3GAAAGZZ CGGGZCZZ3A 

AGZZAATZZZ GZAGZAGGGC CTGAGAGAGA GZGAGZGZZC AZTZZZGZGG GGAZZZZAGZ- 

GGZGACAGGG CTGGTGTGTG AOZTAAZTGZ GAZAAGA3GG Z AG AAA GO Z A AATTTTZZTZ 2504 0 

G GTT AAAG AA TGTTATZZZA TGGAGAGGCC AZZZTZTGAZ GAZGATGAT3 TGTZZZA3ZZ 

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3 2 2 C ~ base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
2 B i 0 : 



(ii) MOLECULE TYPE: DNA (genomic) 



z i 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GTGZZAAZAA AGGGGTGZGG ATAZTGAAGA TATTTGGATT GAZZ-AZCZAZ TZAZAZZZT7 
GTAZCZAGTA AGGGATAGAZ CATCTTTZGA ZATAAGGG ZG GAZ3TZAZAZ ZZGAZAAZAZ 
ZZACZGG3AG AAA3GAGCGG A G G 3 G G A GTT TAGGAAGAAG AZZAZAA3ZA ZG3ATGZ33A 
CAGGTA7GCC AGCGCCAGTC AGGAATCGCT GGGCACCCTG G7CTCGCCAT AC 

AAA CTTGG AT ACACTGCTGG CAGAGCTGGG CCGGTTGGGA ACGGCACAGC CTA 

AATCGTGGAC AGACTAACAT CGCGACCTTT T CG AG AAG CC AGC3CTCTAC AGGCTATGGA 2 6 : 

TAGGATACTA ACACACGTGG TCCTAGAATA CGGTCTGGTT TCGGGTTACA GCACAGCTGC 42: 

C C CAT C C AAA TGCACCCACG TCCTCCAGTT TTTCATTTTG TGGGGCGAAA AACTCGGCA7 4 E C 

ACCAACGGAG GACGCAAAGA CGCTCCTGGA AAGCGCACTG GAGATCCCCG CAATGTGCGA 54 C 

GA7CGTCCAA CAGGGCCGGT TGAAGGAGCC CACGTTCTCC CGCCACATTA TAAGCAAGCT 6QC 

AAA—CCTGC 7TGGAATCCC TACACGCCAC TAGTCGTCAG GACTTCAAGT CCCTGATACA 65: 

GGCATTCAAC GCCGAAGGGA TTAGGATCGC CTCGCGTGAG AGGGAGACGT CCATGGCCG?-. ~2Z 

ACTGATAGAA ACGATAACCG CCCGCCTTAA ACCAAATTTT AACATTGTCT GTGCCCGCCA ?ec 

GGACGCACAA ACCATTCAAG ACGGCGTCGG TCTCCTCAGG GCCGAGGTTA A C AAG AG AAA 84 C 

CGCACAGATA GCCCAGGAGG CTGCGTATTT TGAGAATATA ATCACGGCCC TCTCCACATT 9CC 

CCAA3CACCT CCCCAATCGC AACAGACGTT CGAAGTGCTG CCGSUCCTC* AACTGCGCAC 96 C 

GCTCGTGGAG CACCTGACCC TGGTTGAGGC GCAGGTGACA ACGCAAACGG TGGAAAGTCT 102C 

ACAGG CAT AC CTACAGAGCG CTGCCACTGC TG AG CAT CAC CTTACCAACG TGCCCAACGT 103C 

C C A C ACT ATA CTGTCTAACA TATCCAACAC TCTAAAAGTT ATAGATTATG TAATTCCAAA 

ATTT AT AAT A AACACCGATA CACTGGCCCC ATATAAACAG CAGTTTTCAT ATCTGGGGGG 

TGAACTG 3 CA TCTATGTTCT CCCTTGACTG GCCTCACGCA CCTGCAGAGG CGGTAGAGCC 125 0 

ACTA2CC3T3 CTGACTTCTC TGCGAGGTAA AATCGCAGAG GCGCTGACG3 GTCAAGAAAA 132 C 

CAAAAAC3CT 3 TAG AT C AAA TTCTAACCGA CG3CGAAGGC CTCCTTAAGA ACATTACCGA 1380 

TCCAAAC333 GCACACTTCC ACGCCCAGGC CGTAT CAATT C3AGT3TTAG AAAACTACGT 144 C 

ACATAAC3C3 G3GGTCCTTC TCAAGGGC3A AAA3A3C3AG AGGTTCTCCC GGCTGAA3AC 15 DO 

rGCZ ^ zcz „ AACCTGGTAT CCTCCGAATC ATTTATCACC GTGACCCTAC AC ACT AC AAA 155 C 

CCTTGGAAAC CTA3TTAC2A A23TACCAAA ACCTGGT3AG GZGTTZ^CCG GG33CCCGCA 162 0 

CCTCCTGACA AGCCCGTCCG TGAGACAGT3 CCTTTCCACC CTGTGCACAA CCCTGCTGC3 163 0 

AGATGCCCTG GACGCCCTGG AAAAAAAGGA TCCGGCCCTT CTTGGTGAGG GGACCACGTT 174 C 

GG C G C TG 3 AG ACACTCCTAG GATACGGGTC GGTGCAGGAC TAC AAG GAGA CGGTACAGAT 18CC 

AAT AT C CAC C CTTGTGGGCA TCCAAAAATT AGTCAGGGA3 CAGGGCG3G3 ACAAGTGGGC 185 Z 

CACT3CC3TG AC AAGG CTAA CT3ACCTCAA ATCAACTCTG GCCACGACCG 3 CAT CG AG A C 1520 

GGCTACGAAA CGG AAA CTAT A 2 AG ATT GAT CCAAAGGGAC CT C AAAG AG 3 CT2AAAAACA 1S8C 

CGAGACCAAT CGGGCCATG3 AGGAATGGAA GCAGAAAGTA CTGGCTCTTG ACAATGCGTC 204 0 

T3CGGAACGT GTCGCCACCC T3CTGCAACA GGCTCCCACC GCGAAGGCTA GAG AGTTTG C 21 CC 

AG AG AAG CAC TTCAAAATAC TACTCCCCGT ACCCGCGGAC GCCCCCGTCC AAGCGTCTCC 216 C 

AACGCCGATG G AAT A C AG CG C3AGCCCCCT CCCGGACCCA AAG G AT AT AG ACAGAGCTAC 222: 






gacacttccc acatt: 

CCCGTTGCAC AGACAGGT33 TGTTCTCCAG CTTTTTGGAG GCCCAGATCC GAT7 
GT „ GTAG::G 3GCrrcGTGC CTGGACGGGG TCTGCCCGGA ACACCGCAGA 7CCGAAGGGG 

CGTGGAGGCT gccg: 

GGTACTGGAC ACCTTTTTCC ACAACGCGCC CCT7CCCGCA GAGTCTTCCT CCAA73CTTT 

CCTGGCCATG TGCGTATT3A CGCACCTT3T CTACCTAG7T GGGCGCGCCG TC77GGGCCC 

ACGGGAGCCG GAGCACGCGG CCCCGGACGC G7ACCCAAGG GAG37GGCG7 7GGC2CCGCG 



22= : 
2 34 : 
i-iz: 

2 4 C - 




ATCCATCCAC GGGGAACAGG CGTGGAAGAA GATACAGCAG GCGTTZAAGG ATTT CAA.CT 

CGCCGTCCTG CGGCCCGCTG ACTGGGATG C CCTGGCAGCG GAGTACCAAC GCCGT3GTT 

GCCCCTTCCG GCGGCCGTGG GTCCAGCGCT ZTCAGGGTTC CTGGAGACGA TCCTAGGGA 

GCTGAACGAC ATCTACATGG ATAAGTTCCG CTCCTTTCTG CCCGACGCGC AGCCTTTTC 

GGCGCCGCGC TTCGAGTGGC TAACGCCGTA TCAGGACCAA GTCAGCTTT7 TCTTGCGCAC 252; 

CATAGGGCTG CCGCTGGTGC GAGCGCTGGC CGACAAGATC AGCGTGCAGG CACTGAGGCT C5BC 

TAG CCACGCG CTCCAGTCCG GCGATTTGCA GCAGGCCACG GTGGGCACGC CCCTGGAGC7 

CCC7GCCACA GAG7ACGCGC GCAT7GCCTC CAACATGAAG TCCGTGTTCA ACGACCACGZ- 

ACTTCAGGTG CGATCAGAGG TCGCGGATTA TGTGGAGGCC CAACGAGCCG AGGCAGAGA2 

GCCACACGTC CCACGTCCAA AG AT AC AGG C ACCAAAGACT CTGATTCCAC ATCCGGACGC 

AATCGTCGCG GACGGACTAC CCGCCTTTCT TAAGACGTCC CTACTGCAGC AAGAGGCCAA 

ACTTCTGGCG CTACAGCGGG CGGACTTCGA GTCGCTCGAG AGCGACATGC GCGCCGCAGA 2 94 0 

GG C C GAG AG A AAAGCATCGC GGGAGGAAAC C C AG CG C AAA ATGGCACACG CCATCACTCA 

GCTCTTACAG CAGGCACCGA GTGCGATCTC GGGGCGCCCG CTATCCTTAC AGGACCCGGT 

GGGCTTCCTC GAGGGCATCA TATACGACAA GGTCCTGGAG CGCGAATCCT ACGAGACGG3 

TCTCGAGGGA CTGTCCTGGC T C GAG C AG A C CATCAAGTCC ATCACCGTAT ACGCT 
AGAGGAGAAG CAAAGAATGC ACGTG7TGCT GGACGAGGTG AAAAAGCAGC GAG 
TGAC-ACCGCT CTCGAGCTAG AGGCCGCG3C TACGCACGGC GACGACGCTA GACTCCTGCA 

. AG ~G3 GATGAGCTGT CACCGTTGCG CGTTAAGGGG GGGAAGGCCG CGGTGGAATC 

CTGGCGGCAG AAAAT C C AAA CCCTGAAA7C CC7GG7ACAG GAAGCGGAGC AGGCC3GCC7 
CCTGTTGGCC AC CATAG A 2 A CGGTGGCCGG CCAGGCCCAG GAGA Z CAT AT CACCATCCAC 
ACrcCAGGGA CTG7ACCAAC AGGGACAGGA GGCCATGGCG G C 2A77AAG C GG777AGGGA 354 0 

CTCGCCCCAG CTAGCTGGC'C 7GCAGGAAAA GCTGGCCGAG CTACAGCAGT ACGT CAAGTA 3SC0 
CAAGAAGCAG TATCTGGAAC ACTTTGAGGC CACCCAAAGC GTAGTGTTTA CAGCCTTTCC 26 6 0 

GCTCACACAG GAGGTTACGA TCCCAGCCC7 GCA77ACGCG GGACC777CG ACAACTTGGA 3 72 0 

GCGGCTCTCA CGATACCTAC ACATCGGCCA GACGCAGCCG GCTCCGGGAC AGTGGCTCCT 3 7 6 0 

vCC CCACGCGCCC GGCCTGCGTC CCAGZCGGC2- GCCACGAACC 3 54 0 

3 9 0 0 
2 96 0 

CACCA GTGGGACGAG ATATCTCGCC 7CC77CCAGA 4 020 

4 OE C 
4 14 G 
4 2 00 
GCCTTCGCAC GCGGAGGCGG CGCACGCATG TCTTGTCACG CTGCCAACAA TGCTCAAGG C 4 22 : 

TGTGCCGTAC CTCACGCTGG AAGCCTCAGC TGGACCAC7G CCGGCGG^ TGCGCCACTT 4 3 5 : 

CGCCACGCCA GAAGCGEGTC TGTTTTTCCC CGCGCGATGG CACCACGTCA ACGT3CAGGA 4 44 0 

GAAACTGTGG CTGCGTAATG ATTTTATGTC GCTGTGTCAC CGTTCCCCGG GGCGCGCGCG 4 5 0 1 

CATAGCCGTC TTGGTGTGGG CCGTCACTTG CCTAGATCCT GAGGTAATAA GGCAGCTGT3 4 56 : 

GTCCACCTTG CGGCCCCTTA CTG CGGATGA ATCCGACACG GCTTCTGGAC TGCTGCGGGT 462C 

GCTAGTAGAA ATGGAGTTTG GTCCGCCGCC CAAGACGCCG CGGCGGGAGG CGGTGGCGCC 4 6 8 C 
CGGCGCAACA CTGCCACCGT ACCCCTACGG CCTTGCCACC GGCG^GCGCC TGGTCGGCCA 

GGCG CAGGAA CGCTCTGGCG GCGCTGGCAA GATGCCGGTG TCCGGGTTTG AGATAGTTT7 4 S 0 0 

AGGCGCACTG CTGTTCCGCG CCCCCCTACG CATTTTCAGC ACCGCATCAA CCCACAGGA7 4 66 0 

CT CAGATTTC GAGGGCGGTT T C C AG AT A CT GACTCCTCTC C7GGACTGTT GCCCAGATC2- 4 92 0 

CGAGCCATTC GCCTCCCTGG CCGCCGCACC ACGAAGGACG GTGCCACTGG GAGACCCGTG 4 980 

CGCCAACATT CACACCCCCG AAGAGATACA GATCTTTGCG CGTCAAGCCG CCTG 3 CTT CA. 5 04 0 

ATATACCTT2 GCAAATTACC AGATCCCCAG CACCGACAAC C CGATAC CG A TCGTTGTGCT 510 0 

AAACGCTAAC AATAACCTTG AAAACAGCTA CATCCCTCGC GATCG CAAAG CGGACCCGCT 5160 

ACGACCATTC TATGTAGTCC CTCTGAAGCC GCAGGGTAGA TGG CCTG AAA TAATGACCAC 5220 

AGCAACAACC CCCTGCCGCC TACCGACATC GCCAGAAGAG GCGGGATCAC AGTTCGCCAG 52 6 0 

ACTCCTTCAG AGCCAGGTGA GCGCCACATG GTCTGACATC TTCTCCAGGG TTCCCGAGCG 534 0 

CCTCGCTCCC AATGCGCCTC AGAAGAGTTC CCAGACAA73 TCAGAAATCC A C G A G G T C G C 54 0 0 

CGCCACGCC3 CCACTCACAA TCACCCCAAA TAAACCGACC GGAACCCCTC ^ZGZZTZZZZ 54 6 0 

GGAGGCTGAT CCAATAACAG AACGCAAACG CGGACAGCAG CCGAAGATTC- TCGCGGACAA 5 52 0 

CATGCCTAG7 CGTATTCTCC CGTCGCTACC GACCCCGAAA C C CAG AG AG C CTAGAATCAC 5 5 B 0 

GCTACCCCAC GCACTGCCCG TTATATCACC CCCAGCACAT CGCCCGTCGC CTATACCGCA 564 C 

TCTGCCAGCA CCGCAGGTAA CGGAGCCCAA AGGGGTTC7C CAAAG CAAAC 3TGGAACTCT 570C 

CGTGCTGCGG CCCGCCGCGG TCATTGACCC ACGGAAGCCC GTCTCGGCAC CGATCACGCG 5 "6 0 

ATATG AG AG G ACGGCGCTCC AGCCCCCCCG GACTGAGGG-C GAAGGCCGGC GCCCTCCCGA 55 2 0 

CACGCAACC- GTCACTTTAA CCTT7CGTCT CC2ACCTACC GZ^CZZ^ZTZ CCGCAACTGC 58B0 

AGCCCTAGAA ACCAAAACAA CTCCCCCATC CACGCCCCCA CACGCCATA3 A C ATT AG CC C 5 94 0 

AC C A CAG A C A CCTCCC ATGT CCACCTCACC TCACGCGA3A GACACAAGCC CCCCCGCAGA 6000 

AAAGC3GG32 GCACCCGTCA TTCGAGTAAT GGCGCCCACG CAACCGTCGG GAG AG G CAAG 6 06 C 

AGTCAAGCGA GTGGAGATCG AACAGGGCCT TTCCACACGC AATGAAGCCC CTCCCCTTGA £12 0 

ACGCTCGAAT CACGCCGTGC CCGCCGTTAC CCCAAGGCGC ACCG7AGCCC GCGAAATCAG 618 0 

GATCCCGCC3 GAGATAAAGG CGGGTTGGGA CACTGCACCG GACATTCCTC TGCCCCACAG £24 0 
TCTTCCCAAC CTCGTAGAGA GATACGCGCG GGGTTTCCTG GACACGCCCT CTGTAGAG3T 
GATGTCCCTG GAAAATCAGG ACATCGCCGT GGACCCCGGA CTGCTAACCC GCCGGATTCC 
ATCCGTGGTG CCCATGCCCC ATCCAATTAT GT3GTCACCC ATAGTACC C A TCAGT-TA CA 
AAACACAGAC ATAGACACTG CAAAGATAAC ACTGATTAGT TT T ATT AG A C G CAT 3 AAA C A 
AAAAGTGG CC GCCCTATCGG CGTCCCTGGC GG AG AC G G TT GACAGAATAA AGAAGTGGTA 6 6 0 

CTTGTGACTC CACGGTTGTC CAATCGTTGC CTATTTCTTT TTGCCAGAGG GGGGTTTCCT 
CGCGTKGCC ACCGCGGGGG CGGCCGTTTC CGTCGTGGAT GAGAGGGTTG TGAGAATGTC 
TGACGCCGGC GACAATGAAT GGGG AC C AG A GGACAGGGTG GTTATACTGC TTCCCGAGAC 
CCCCAGTGAG TCCTGGCCCC CGGGCGTGGT GCCGGATGCA GGGCCTGGCC TCGAAGGCAC 
GGTGAACGTC CCCGCGTCGT AAGCCGACGC CGCGGAAACT CGGTCAGCGC GCTCGCGCGG 
TTTCTGATCC CTAAGGGTCT GCAGATGATC CCGCCTTTGA ATTCCACCCA TCCTCCTCAG 
ATAGGCCTCA TAATAATGAT GGGCAATTAA GAACACGAGA TAGTGTCTC7 TTTGCACGAG 
GTATTCGGCC TGCGACATAT TTCCCTGATC CAGGGTATTC ATGCGAGCCA CCAGGGGATG 
GTGAGCGTAG TCATGATCCA GTCGCTCCTG GATCACGGGG TCTCTCACCT TAAAGTTGGA 714 0 

CATCTTCCAC ACAGGCGGGC GAAATAGCCT CAGGAGGAAC ACTTCCCGCA ACAGAACTCC 72 CO 

AGCAGCTGTG AGGTGAGCTG AAGCAGTCCG CGCACGTCAC GGTGCTTTAA TAG G G C AG C C 726 0 

TCGCAGTCGG GCGTCCCAAG GCAAGGCACT ACAAAACTGA CAGTTTGATC TAGGTCTCGA 
ATGGCAAGGG CCGCGTTGTT AGCTAGAACA GCCCTGATTA CGACGCGTGC TAGGGTCCCG 
CGTCCGGTAA TAT Z G CAT AG GGGATACAC Z CTCATATGTT CGCTGCCACA GTAAGAACAG 
TAGATC-TCC CCGTGGTCGC ACAGATGGTG AACTGCTTCT CTTTCCTGTC CCTGCTGAAA 
AACACGTTGG TGGGAGGAAA ATTGACAGTA TGAAACTTGC CCCTGCCAAA GTTAAGACAG 
„ 3ZZZ ^„ ZZ CCATGCACAC AACCGCCCGA GCGCAACGC3 CCCGCTTGGC AAGGGCCGCG 
C33GCCACGC GAGAACAGAT GACGGGTATG GACACGCAGG GGGAGAGAAC ATTGTATGCC 
AGAAGCCTCC TGCCAAGGTT CCGCACGAGA CCAGGTCCCT CCTGCTC3CA GGCGGGCAGC 
ACTACGTGGC GGGACTTAAT AAGG CTCAAA AAA C A C AG TG ACCCAAG CAT GGCG7C3AAC 
GG3TTACCGC AGGGAACCGT AGGGGCGACG C G CT C C AAGG CCTCCCGGAG GCCG3TATCT 
GCCGCCCCTA TCCCGAG2CC GTTACCGTCT TCGGTCGCAG CCACACCGCG ACGGGTGTGC 
GAG3GCACCT CCAGGAGGGG ACGACGCGGC AACGGCCCAT GCCACTTCTT Z CTT AG C CAG 7 96 0 

GGTA3CGACG GTGGGGGCTT CGAACAGCAG GTCAC7AACG GAAAGCGAGA GCAAAGCGCC 8 04: 

AACAGCTTGC AGAGTTG33C ACAGGCCTT3 GAAAATG3AA GC3ACAGG7A TTT7GCCCA7 6ICC 
ACGTG3CGCG GTATCGCCCT AGCATGGTCG GGGGCCTGGG CAC3G3ACAG CGTCACCACA 616 0 

ACCCATACGT GGGCGCCAAG CAGCTGCTGC GCCGCACAAA TCTGCGCCTG 777GGCGACG 
GTGTCTGAGC CAGCGCGCAA CACGG CGATC GCC7GCGCCA GCGAC333C3 372CAACAGG 
TGCCTGGCCC AGGAGGGCAT GTTTCCCTG3 AAACCCC3CT CC CC3AA7A7 G A Z AAAAG C Z 
ACATATTCCT CCACTGGCAC GCCATTCTCG CCCTCGAACA CGCGG7GGGC CGTCAG C7GG 
g::ct=at::::a AACCAAACCA AGACACAAGA AAGCGATCCC AGCGC7GA7C CAGGG C CA7G 
ACCTTCTCAC CAGCGCGACC GCACGGCC7A AGCTCCACTG AAAGGCGCCC AGAA7CCGCA SSZC 
CCGT=CTAC = CCCCTGGCCC GCCCAA7A7A CCGCTGT3AC GTCTGATG7A CAGGCCCGCG 5 5 E ■: 

CGTCGCGGCC G7TGGTGGGA AAACCGGCAC CACCCTGTGC GGCCGAATCC GCCACGGGGG S4 4C 

CTGCCAGACA GTACACTGTC TCCAGCAGCG ACTTCAGTCT CTT3TGACT7 TTGGGCGTCA 
CCACCAAAAA TTGCAAAACC TGCCTGTAGT CCGTGAAGTA GGTACGGCA7 A7TACCATGG 
AGT7GTACAC GCCCAGG77C TTTGAGAACA CCAGGC7CGC C77GAAC777 GTAAAGT CAT 8 B Z C 

CCTGCCCCAG CACGACAGAC GTAT7TTTGG CAAGGTATAC GTCCGACTCC ACGGGAAGGA 53 SC 

CGTGCCCAAA CTGGGACACG GCGTCGCTTG GTCGGCACAG AAAGCACTTC AGGGTTGTGG B94C 
AAAGGCCATT ATTCGATATA ACAAAGCAGG GAGAGAACGG GTAGTGCA7C TCCTCCAGGA 90C 0 

GGTGCGCCCA AAAC77ATAC ACAAACTCTA AGTGG7ACAC GCAACCGTGC TGCAT7CTAA 906 0 

CCGTACA7A7 GGCGG7AGCA CCGCCCTTAG CA7AAAG7GG GGCCCZG7CG ATGCACCG77 
CCAAATCCAG GGACTGACCA GACTGTCCCA AG7ATGAGGA 7ACCACCCGA CACAGTTCG7 
CCACTACACG CTTACCAACG AC ACT CAT GG CGACAGCGGG GTGGGGCTGG CAAGGCCCCC 
AAAG CG CGAC AC CCGCAG7C AA7CAGGGCC G7GCCCGCGC C7CGGAGAA7 ^ZGGCGTZCG 
TGCTCACGAT C7TGCGCAGG ACCTGCCTTA CCGTG7CCAC CTTGC7C7CC AA C A CC AG AG 
7ATGATCGCA GGCTG CAGGC 7GTGCCCGC7 GGACGAGAAA GGTTT77AAA 7AC7GACAGT 
AGTTGA7GG C GTTCAATCTA CAATAGATCG TGGGAAATAA AA777GCA7G 7CACGAGGCA 
GAAGCTGG7C AGACGCG7AC TCCA7GTTGG G7TCCACGGG GAGGGGAACA CACGCCCCAA 
GACACGACGG CG CACATAGG GAGCGGAG CA AACAA7TGAT TCAAATA777 GAC7CCGCAG 
CGAGCCGG77 TGCAGAGTGG TCACCTGCCC TGCTCCACAC CCACCCCCGC G7CTC7TCCA 
AC7CTCAAC7 CACGATCCAG GGAAACCACC GTCCAG7GGC CATGTTTGTT CCCTGGCAAC 
7CGG7ACAAT TACCCGTCAC C G AG AT GAG C TCCAAAAACT ACTGGCAGCC 7CCCTGC7CC 
CGGAGCACCC GGAGGAGAGC C7CGGTAACC C CATAA7G AC ACAGA7TCAC CAGTCGCTCC 98 4 0 

AACCATCTTC CCCCTGCAGG GTC7GTCAGC 7CCTA7TTTC TCTGGTCCGC GA77CG7CCA 99 CO 

CCCCCATGGC- TTTC7TCGAG GACTA7GCC7 GCC7CTGC77 C77C7GTCTA TACGCCCCAC 
ACTGCTGGAC CTCGACCATG GCGGCAGCGG CAGACC7G7G CGAGATCATG CA7CTGCAC7 
TTCCAGAAGA GGAGGCGACA 7 AC G G G CT AT TCGGACCGGG TCGCCTTATG GGTATCGAC7 
TGCAGCTGCA CTT CTTTGTT C AAAAGTG CT T7AAGACCAC CGCCGCCGAA AAAA7ACTGG 1014C 
GAATATCCAA CCTGCAATTT TTAAAATCAG AATT CAT C CG GGGCATGCTC ACAGGCACCA 1 Z 2 0 C 
TCACCTGCAA CTTCTGCTTC AAAACGTCCT GGCCCAGGAC AGACAAGGAC- GAGGCCACCG 10260 
GCCCCACCCC ATGCTGCCAG ATTACAGACA CCACCACCGC ACCCGCGAGO GGCATACCGG 10220 
GGCCACATTC TGCGGCGCAA GTCGCCCCAC AAAG C C C AG 2 CTACTTCCCG 1G3SO 
CGCTAATAGA TATCTGGTCC ACGAGCTCAG AGCTCCT7GA CGAGCCGCGG CCTCGACTGA 1-4 

TCGCAAGCGA CATGAGTGAA CTCAAATCCG TGGTCGCATC CCACGATCCG TTZZTZZZZ 2 

CCCCGCTTCA G G C AG A C AC C TCACAGGGTC CATGTCTGAT GCACCCAACC CT3GG3CTAr 

GATACAAAAA CGGGACTGCA TCCGTCTGCC TCCTCTGCGA GTGCCTTGCG GCACA2CCAG 

AGGCACCCAA GGCGCTGCAG ACCCTTCAGT GCGAGGTAAT G G G C GAT AT A G AAAA 3 AA 3 3 

TAAAGCTGGT AGACAGAATT GGCTTTGTGT TGGACAACCC ATTCGCCATG C CAT AT G TAT 

CAGATCCGCT ACTTAGAGAG CTGATCCGGG GCTGTACCCC ACAGGAAATT CACAAGCACC 

TGTTCTGCGA CCCGCTGTGC GCCCTCAATG CTAAGGTGGT G T C AG AG G A C GTACTATT33 

GCCTGCCCAG GGAGCAGGAG TATAAAAACC TCAGGGCATC CGCGGCCGCC GGA GAG GT 3 3 

TCGATGCCAA ■ CACCCTGTTC GACTGCGAGG TCGTGCAGAC TTTGGT CTTT CTCTTTAAGG 

GTCTCCAAAA CGCCAGGGTG GGGAAAACCA CCTCACTAGA CATTATTCGG GAGCTAACCG 

CACAACTAAA AAGACACCGC CTAGACCTGG CCCACCCCTC ACAGACGTCA CAGTTGTACG 

CTTGAGCTGG TCGGGGGCCT TCGCACCCCA TCCACCGATG CCGAAATCAG TGTC3AGC3A 

CAT 3 AG CTTG GCGACCTCAA CCGGTCGCAG TGGACCGCGA GACATCAGAA GATGCTTGT3 

ATCCCGCCTG CGGTGGGTCC CGZCCGGGGZ GCGAAGCGCC AGGGTCAGCA GCAAGCAGAG 

CGAGCTAGGA GTGGACTTTC TCCGTGAGAT GGAGACCCCG ATATGCACCT C GAAAA 2 AG T 
AATG3TGCG3 3TAGAGCTGT CTACCGTCGG ACCCGGCCGC TGCGTCTCCC TGTGTCCGTT 
T3GA2ACTG3 TCAAACATGG GGTTCCAGTG CGCTGTGTGC CCATGGACAG AAAATCCCAC 
C3TT3Cc::AA GG3TGG3GG2 CTCAGACAAT GGTGGGCGAT GGGCT2AAAA AAAAT AA C G A 11560 
GCTATGCT3G G TAG CGCTGG CCTTTTATCA CCACGCAGAC AAAGTGATCC AAGA3AAGAC 1-64 0 
GTTTTAG2TA TGA3TGCT3A GTCACTC2AT GGATGTGGTT CG G C AG AG CT TC3T3GAGGC 117CC 
TGGTGTA3TG TA33CTAAG2 TGGTGCTAAA AACCTTTGGG 2ACGATGC3G TACCCATCTT 11 76 0 
CACTACCAAC AACGGCATGC TAACAATGTG CATCCTTTTT AAAA3C3GGG GAGTA3AT3T 11S2C 
GGGAGAAA3T GCGCTTAGGC TGCTTATGGA TAACCTCCC2 AACTACAAGA TATCGGCGGA 11880 
CTGCTGCAGA CAGTCCTACG TGGTCAAGTT TGTCCCAACG CACCCGGACA CCG2AAGCAT 11940 
TGCAGTGCAG GTACACAC3A TATGCGAA3C GGTTGCGGCG CTAGACTGCA C3GAG3AGAT 12C0C 
GGGGGATGA2 ATTCAAAAGG GAACCGCAGT TGTCAAGGC3 GTATAAGGT3 A2AT3TA3G2 1206C 
TGTCAG323A GCTCGTATTG CAACTGACGA TGTTCAGGTG GTAATAAAGT CATTAAAC3A 22 12C 
CAAAGTGATT C TTTT AAT Z T GTTTATTGTT TTTGAACATG TGG2A3ACGC TGCAAT3TA2 
TG2CATGAAA GGTGGTT3TA TATC2ACCA2 TTGGCGTCTG 

TCATTAA2AA ACAAGGTCAA TACATTGTGA GGGAGTGTTT TTT33CATGG TAG CATTCGT 
GTGGTTTGGG AGAGCGGACG CCAT 






TG2 2A3AATT 1224 0 













CTTTAATGCG GAGAGGAATG GTGGCCTGGT TGACA CCGCG TGCCGGCCA7 C7GAAC7G7 
ACTGTGTTAT GAGCCACGGG TATGCCCTGG ATACGCCTG2 TCTTCAGCAT TG7ATGTGT 

TAATGT7GTG CTTGGTGCAA CCG7GATTGT GTTTTTGTA7 TT7ATTT7AC 7GACAC7C77 12 6 0C 

TGGGAGGGCA CGCTAGCTTC AG7GCGCGCC CGTTGCAACT CGTGTCCTGA A7GC7ACGGG 12 6 6: 

GCCACGCTGG CCACTCGGGG GGACAACACT AATCGCCAAC AGACAAACGA GTGGTGGTA7 12 7 2C 
CGCCCCAAGC CTCCAGCGCC ACCCATTTAG TAACACATCC GGGACATGAA CTGCCACAAA 
CACCGTTAAG CCTCTATCCA TGCATTGGGA TTGGAGTGAG GAGGGAGGAG GGCACCAGGT 
TCCCGGGGAG GAGGG CACCA GGTTCCCGGG GAGGAGGGCA CCAGGTTCCC GGGGAGGAGG 
GCACCAGGT7 CCCGGGGAGG AGGGCACCAG GTTCCCGGGG AGGAGGGCAC CAGG77CCCG 
GGGAGGAGGG CACCAGGT7C CCGGGGAGGA GGGCACCAGG TTCCCGGGGA GG?,GGGZACZ 
AGGTTCCCGG GGAGGAGGGC ACCAGGTTCC CGGGGAGGAG GGCACCAGGT TCCCGGGGAG 
GAGGG CACCA GGTTCCCGGG GAGGAGGGCA CCAGGTTCCC GGGGAGGAGG CTGG GGTGCG 

CC3CGCCGGG TTCCTGGGGT GCGCCGCGCZ GGGTTCCTGG GGTGCGCCGC GCCGGGTTCC 12 2 0 2 

TGGGGTGCGC CGCGCCGGGT TCC7GGGGTG CGCCGCGCCG GGTTCCTGGG GTGCGCCGCG 1326 C 

CCGGGTTCC7 GGGGTGCGCC GCGCCGGGTT CCTGGGGTGC GGGGTGCGGG GGACCGCGCC 13 32C 

GG3GTAC7GC AGGGTTCGCA GGGTTCGGGG GTACTACCTG GTTTCCTGGG GZGTGCCAGG 1338C 

AC3GG7TCCT GGGGTGCCAC CGCTCCTCGA TACGTGTAAA T C CAAG AG AT CCGCCCTCCC- 1344C 
TG CCGCCGCG CGCGTAATGC GCGAGGGGGG TCGGTCTCCC CTCTTCTTTA TAGCGTTTCC 
TGCGAAGGGG GCGTAACCGT AGGACAAACT GCTTATGTAG GGGTTAGCCA CCCATTTC CC 

GGGGCCGCGC CAGAGGTGAG CGTGGACCTA GCATCCCGCT CCCATTTACC G AAA CCACCC 13620 

AGAG3CGA3A TTCCAGGGCC GTGACTCACT AGCTCCCCTC CCATCGAACA ACCACGCTTG 1366C 

GCTAACACGG CTGGAGTGGC GGTGGGCGGG GCCCCTATAA TCCTGGCCCC CATCTACTGA 1374 0 

AAC G AC CC AG TAGAAAAATC CCAACCCCAT GACTCATCAG GCCCTATTAT ATAGAATATC 13B0C 

CCA3TAGAGT GACCCAGCTG GTTTCCATAA ATGGATATAC TTCCGGAAAA CGAAGGAGGG 13B6C 

TTGAATACAG TTGGGGGTAG TCCGCTGGTA TTCCCAGCTG AG GTTGCCTT ATTTGGTAA7 13 92C 

GCTTCCGGAA ATACCACCTG AGTACCCCAT TGGTTTATAC CTTGTTTAAT TGTAGAATTA 13S8C 

CAGCTGGATT TACCCAGCCG GGTTTACGCA GCTGCGTATA CCCAGC7G7G TTTACGCAGC 14 04 0 

GGGGTTTACG CAGCTGGGTA GACCCAGCTG GGTATACCTA CTGG AA TAG G GGCTGCGATG 1410C 

ACTCAGCTGC GCTAGGATTA AAGGATTATA TATATATATA TAGGAAAAAT CAAAACAAAA 14 16 0 
CTCTAATCGC TGATTGGTT C CCGCTCTGGG CCAATCAGCT TGGGAGTTCT AGGGATAGGG 
GCCAATGGGA GGCCTCCGAA TTTGATTGAC GG CTGGGGCG TCCAATGGAA TG3C3CGGT2 

GCCTAGCTC3 AACGGGATTG GTCGGCCGGA TGGGCCAATG GCGGCTCGGA AAACTTTGAT 14 34C 

TGACGGGCCG GCGGACCAAT GGGAGCGGGG CAGAGGA7TA TGGGGGATTA GCAAATTCAA 144 CC 

GATGGCGGCG CCCATGAAAT GGCCAAAAAT TATAATTTTT CGAGTCGCTC ACG3TCCCA2 14 4 6 0 
CTAGCGGCGT GACCTGGAGG TGACCCCGTG CACCCGGGCG CTCTGAATTT 7TCTGCGCA7 1-2. 
GCGCGACTCC TCATC7ACAT AAT7TATGCA CATAAAAG3A TTAGCGCATG CAAATTAGTC 14 53: 
AGATAGCAGGGCCATCCACA CTTTATGTTG GCCGCGTGCC AGGC3CCGGC GTGGGCGCCG 
CGCGCGTGCT CTCTCAGTCG CGCCTAGCTG CTTCCAACAG ACAAAAGCGG GGCG..A3TG 14 
AGGGAGTGCG C3CGCTGCGC TGACTTGGCC GATTTCCAGT GCATGCTTTG TCACCCCAGC 1476: 
GCGAGAATGG AATTTTCATT ATTGAGCAAT TTGGGCACCC TGGGGACGAT AA C CAT A CAT 
GGATACACGG GTTCCAAATA TGCAAAGTAG ACACTAAGGT ACCATTTGGC ATATTTGG A C 
GTCCTGGGCA GGTTAGCTAC CCACCAGAAT AT AT G G G ACT CTGGGCAGGA TAGCCACCCA 1494 C 
=AATTGTTT T GCGCCCZTCT TTGGCCAGGG GACCAAGGTC GTATGGTTCG CGCTACACTA 15C0C 
AGCCCGAACG TTCAGCTTTG CGTGCTTTCG ACGTCCAGGC GGCTGGCACA CGGGCCGTGA 
GCGCCAGCAA CATGGGATCA TGGTAGTAAG ATACAG CAT A AATCCCCGTC CGGTGGCGC7 
CAACGCCAAT ATGCGCGGCT GCGTGGTATC TCATCGGTGG GCACGCGTAC GGTGGTCTCA 
TGGGTATTGG ACTTGTAGGC GAGGGGAGGC GCATACGACA AAAATTGCCG CCGTGAAGGT 
CGGGAACCCG CCCGCGCTTC CGCAAGGCAC GGGGCCGCAT CGGACACAGG CTAAGCATTA 
AGGATCATAA CACCGCCCTA GAAATGTTTA AGCTGTGACC AAAGCGAACC TCGCATGAGG 
CATACGCGAG CGTGGAGGTA GGATTCCCAA GG CT ATT GAG AGACGGTGGG TGAAATGATG 
AAGAACACAC AGAACAATAA CGGGCGACTA GATAAAAAGA CTCGCTCAAC AGCCCGAAAA 
ZZ ^ TCAGZCZ GACC3CCGA - GGATTAGGTG CTGCTGGACA AGTCTTTCTA AACCCGCGCA 
G3GTTTGTGT CGATCCAGAC GCTTACGAAC GCCCGCTTTA AAAACACTAT TCATAATTAA 156CG 
■CAGAAGTTGA CACCAGCCCG CAGTTACCCA ACCTTCTATT TTTTTGGAGT GTTGACAAGT 1566C 
TT=CATC3::c CGTTTGGCGT TTCCCGCATG GTGTCAAATT AG T G A C G C AC —TCZCZZZG 
TCACTATGGG TTTACCCTGA TTTAGTAAGT AAAACTGCCG ZZZZZGZZZ?, CTCATTTTTT 
TACCCTGTTA TTTGCTGTAT TTA CAT CT AC GGACCCCCTT TTGGTGAGAT TGCCG7GGTT 
CTAAATAACG TTGTGGTTTT CGGACCCTTT CAGGGAC CAA ATCTTTTAC3 TGTTGCCAAG 
GTAC-CATTTG CTGGACCCGC ATAGGTTTTT GTGGCACCAG GTTATGGTCT TATGAGCGGG 
CTTGACCGGC AAGTTCCAGG CATCCTAAGT GCTTGATGTA GACCCTTAGG GCACCAGGGA 
CTACCTAGGT CAAACTCCCC CTTAGT CATG ACGCC3TGCC CACGA33TTT GAGAGGCGTA 
GACATCCGTG TCGACTGCTG GACGGAGGTA GTATAATCAG CTAGGCCTCA GTATTCTATG 

, „„„„^,. --A-TC-CCG GTTCCACCAG 162 CO 

T AA C AAATG A ATGCCCTAGA GTACTGOjj. ^i^-.^o-- * 

G2GGCGTTGT GGCCACGGGC GGTTCGTCGC TTGGACCT3G AGGGGTGTCA CATTCTGTGA 15260 

CCGCGACGTT GACGTTAGAC ACACGTCGCT GCCGTCCTCA GAATGT3ATA GCCCATCACA 16320 

GGcItTGTAG CTGTTGCGTT GGTTGGGAGT TTGGGGACCA AATTTCTATA ATTGGTGTCA 16380 

CCGCGGCAGC TCTAGCCCTG GAAGATCTGG AAGCTTGCTT CAATG3CTCA GA.CGACCC3 16440 

ITT AG C3 AA3TAG ACT C ATT AT A ATCTTAATCT TAAATCT33T T3ACG3ACTT 165 CO 
TCGCGCCGGG AACACGCAGG TGGCAGCGGA TGTGTTTTGC CCAAACACGA GGG..G=AG3 
AAACAGGTGC TGCCGGGGAT TATGTACAGC TTACACCCAG TTTCCTGTAA TCGCCCGCAT 
CCGGCCGTCC TGGGCAGCAC CGZACCZTGZ GTAAACAACC GCGTACTTTT TCCTTCrTCrC 
CCCACCCCCA CATCCTTCCT CCCACCCTGC CAGTCCAACC CGCTTCCTGT TTT ATT C G C C lc"^C 
TTCAAACAGA AGCACGCATT CTAATGATTC TTACAAAACT TGTTAGTGTT TATTAAATCA 
GATACATACA TTCTACGGAC CAAAAATTAG CAACAGCTTG TTATCTATGG TGTATGGCGA 
7AGTGTTGGG AGTGTGATGG GCCGGAAAGG TGAAGGCCCA TTAGGGTTTG CACTTGGCGC 
TGTAGGTCTA CTCTTGACAA AGATCTAAGC ATTGACATTA G G G CAT C C AC GTCAGTGGGA 
CCCAGTAGGT CTAAGTTTTC CAT A C AGTAC ACCCAGTGTA AGATGTCTGT GGTGTGCTGC 
GAGACCCTAT AGTGTCCTTG CTTAAAAATA TCAAAGACCT AATATCCCTC GCACACAGCT 
CCCCGTCTAC GTGGAGAACA GTGAGCTGAT AAGGG CTGAA ATAACTCATT GTGCCCGCTA 
GGTGGCGCTC TAAAAAACGC GGGTCTAAGT GAAGCAGGTC GCGCAAGAGG TCTCTGCGAC 
CTGCACGAAA CAGACATTCC GCTAACAGGG GAAACGTTAA CCTGCCCTCC T CCTTT AAA G 
CTCTAAGAGC TCCAATTAAT TGGGCCAGTG TGGGTTGAGG TATGAACACG TTTAGGAGGA 
ACAATACCAC TTCCCTGTCA TCCGTGCCCA GTTTCCGCGC CACCTCACAG AGAACCTCGT 
AAGTGGCCAT GGTGCCG3CT TGTATATGTG AAGGCACCGA TGTGGAAAAA CAAAGGAAAA 
■ATTTTTC CGCCCTAAAC AAAATCACAA GCTTAATAGC TGTCCAGAAT GCGCAGATCA 
CAGATGTTAG GATCTGTTCC ACTGCCGCCT GTAGAACGGA AACATCGCAT 17 580 
CCCAATATGC TTGCCAGCTG AGGAACTACC CCACCCGAGT GGGTATCCTG CGGAATGACG 1764 0 
TT3GCAGGAA CCAACAGCGC ACAGCCTGCA GCGCTGATAA TAGAGGCGGG CAATGAGCCA 
GTCTTTGG3T CAACTAAGGC TTTTGTAATC AGGGT3TTGA CCTCGTGGTG CCAAAAGTCC 
AGGTGTTGGG AGCCCCCCAG CAATTTAAGT AACAAGAAGG AAGTGACGTC CGTCGCTAAG 
ACTGCCTCTG TTCGCCACGC CAACTTCTCA AGGAGTTCTT TCTCCTGGTC TATAAGTTCT 1788 C 
TGGCGGGAAA AGGAGTCTGC CGCGGCATAG CAAAGTGAAC TGGTAGAAAT AGGCGTGAGG 1794C 
CTTCTGAGCT TACTGGCCAC TAACAGGCAG GCGCTCCCTG TCTTTTGAAA GTGTTCTTTG 1800C 
GACACCTGCT TTATAAGTAG GAGTCTGTCC AAAAGATTAA GGGCCAACGC GACCACGTTA 1806 0 
GGTTCTAGGT TGTATTCCTG G2AAACTGAA AACATC CATG TGCCCAGTAA CTTACGCATA 1812 0 
TGCGAAGTAA GAGATTGTTG AAAGGTCCCA AATACAGAGT CAGAAGTTAA AAAGCGCGGC 1B18C 
TCAATTTCAA GAATATTGTA AAAGATCCGA TCCTCACATA GCGTGGGATC CAGAA3TCCC 1B240 
GAGGGCGGGT TATTGGCAGT T 3 C CAT AT AG AGTGGCGAGC GTATGTGGCC TACCTGTAGA 1E30C 
G 2 CTGGAGTT TCAGGGTGCT CTGT CAGGTT TTCC2ATCGA 2GACGCTGG3 CCGC3AGAGT 
AC „„ GCCG TT3TCCGT3T GTTCAGTTGA GGTAGATGGG TCGTGAGAAC ^ZTGZZCCZZ 
ACACACACCA GCACCCATGG CGCCAAATGC AAGTGCGGAG CGGCGACGGT GGCTTCTAGG 
GAGGAAAAAG GGGGAGAGGT 3TGGCTTTTA TGTCATTTC2 TGTGGAGAGT 
7TGG777TC C C77GGCTGGG 7TAATGGCAG GGG C7TTTTA 
TAGGTTTCCT GCCAGGGGGT GACTAGCT7C CCAGGCTAGG 
AC7TG7G777 TTGTTCTGAC AA7ACACA7A TACACAA7AA 
TCGAGGG7GG GGCAAGCAGG ACACGGGGCC TGCCTTTACT 
AGA7AA7TT7 77AAG7CCG7 ATGGG7CA77 GCCCCAAAAA 
ACACT7TGGA TC7CGTC7TC CATCCTTTCC CAAAAAGCGT 
CCTAGCTTTC GCAGGACAAT CATCTATCTG TCTGTAAGGG 
TGGATGTGGC 777777GGGT GGGTAACTGG AACGCGCC7C 
GGGTGGTGAT G77C7GAG7A CATAGCGGTA 77CGCGAGAT 
GTCTGGTGTA 77A7CTCC7G GTGGGC7ACT GG CAATTTG T 
GTAATCCACT 7CCA7TTCG7 CCTCGGATGA CGACCCGTGC 
. c3TrT:::rrGC TC C7GCTG77 CCACCGGCTG CTGCTCCTGC 
CTG=TG=TC.C TGrrGTTCGA CCTCCTCTAA CTCCTGCTCT 



AACT7AACTA 
CGGGCCA777 
GTTATGGGCG 

ATCACTG CAA 
CTATAAAAGA 
ACCGGTGGTT 
A7ACGAAC7C 
GGGCCAGG77 
7CATG7G7GC 
AAGAT7A7GG 
TlTTTCCACCT 



7GGAAGA77G 



G 7 AC 777 7. . 
AC7GG777GG 
GGAAGG C 77G 

TG7G77G7GG 
G7TGG7A7CT 
CAGGTC7G7G 
GTGGG7CA7C 




CTGCTGC7C7 

TAA77CC7GC TCCTG C7CC7 CTAAC7CCTG CTCCTGCTCC 
CTC7AACTCC TGCTCC7GCT 



AAC7CCTGC7 



TCCTC7AAC7 



GCTGC7GGTG 

CCTGC7CCTG 
GCTCC7GCTC 
CC7GC7CCTC- 



CTGGTCCTCT AACTCC7GC7 CCTGCTCCTC CTGCTGC7CC 

TTCA7CC7GC TGCTGC7GC7 CATGCTGCTG CTGCTGCTCA 

CTGCTGCTGC TGCTCATC7T GCTGC7GC7G CTCATCC7GC 

CTGC7CATCC 7GC7GC7CCT GCTCATCCTG CTGCTCCTGC 

CTCCTGCTCC TCATCC7GC7 GC7GCTCATC C7GCTGC7GC 

CTCCTGCTCC TCATCCTGCT GC7GC7CA7C CTGCTGC;-j. 
CTGCTGCTGC 7CA7CCTGCT GCTGCTCATC CTGCT3C7GC 
CTGCTGCT3C TCATCCTGCT GCTGCTCATC CTGCTGCTGC 
CCGCTGCTGT GGC7CCCGCT GC7G7GGCTC CCGCTGCTGT 
CCGCTGCTGT GGCTCCCGCT 3CTGTGGCTC CCGCTGCTGG 
CCGCTGCTGT GGCTCCTGCT GCTGTGGGTC CTGCTGCTGT 
CTGCTGCTGT GGCTCCTGCT GC737GGC7C CTGCTGCTGT 
CTGCTGCTGT GGCTCCTGCT G7TG7GGCTC CTGCTGTTGT 
C7GCTGTGGC TCCTGCTGTT GTGGC7CC7G CAGGGGC7CC 



TGZTGCTZ^Z 
7CA7CC7GC7 
TCA7CC7GCT 
TCATCC7GCT 
TCATCCTGCT 



gctgctcat: 
cctgctgct. 



GC7G CT CAT! 
GC7GC7CA7: 



3GCTCCCGCT 
GGCTCCCGCT 
GGCTCCTGCT 
GGCTCCTGCT 

TGCTGCTGTG 



GC7GC7CA7-: 
3C73C7CA7-: 
'3C7GCTCA7: 
GCT3TG37T: 
3C73T3GG7: 
GCTGT33C7: 



GCTGT3GCTC 
3ZGGZZZZTZ- 
GCTCCTGCTG 
CTGTC-GC7CZ TGCTGTTGTG GC7CCTGCAG GGGCTCCTGC TGCTGTGGZT CC7GCT3773 206. 

TGG7TCCTC-Z TGTTGTGGCT CCTGCTGCTG TTGTGAACTT TGGATGCTCA AC3TTTTG77 2 0 ~ : ■- 

TCCATCGCCZ CCGTCCTCCT CGTCZTCCTT CTZGTCCTZZ TCCTCGTZA7 CC77CTCG7Z Z0~6: 

CT CATTGT C 2 T CAT CAT CG T CATCCTCCTC GTZZTCCTZZ TZCTZGTZZT ZCTZZTZG7Z 20 = 2: 

CTCCTZCTCG TCCTCCTCCT CGTCATCCTC CTCGTZATCZ TCCTZGTZAT ZZTZZTZGZZ 2 0SSZ 

ATCCTZZTZG TCATCCTCCT CGTCATCCTC CTCGTCATCZ TCCTCGTCAT CZTCZTCGTZ Z094 C 

^„ cztz::t::g TCATCCTCCT CGTCATCCTC CTCGTCCTZZ TCATCTGTCT CCTGZTZCTZ ZIOOC 

CTZATZATZC TTATTGTCAT TGTCATCCTT GTCAACCTGA CTTTCCTTG Z TAATCTCG77 2106 0 

GTCZC2ATTA TCCTCGCCAG CCTGATTATT TTCGGAACAT TCTTTTTCAT TZTTGGATGZ 2112 0 

TTCTTCTGZA ATCTCCGCAA GG AG CACCAA CATGGCTGTG TCATCACCCZ AGGATCCCTZ 2113 0 
AGACGGGGAT GATGATCCTA TGGAGATGGG AG ATGTAG G C GGTTGGCGTG GCGGAGTATZ 
GZCATZGZTG GATGATCCCA CGTAGAT CGG GGACTCTGTG GCCCATGGGG GGTACACAZT 

AZGGTTGGZC- AAGTCACATC TAGGGGGAGA GACTGGGGGC GACTGACATA TTGGGTTTAG 2136 C 

TGTAGAGGGA ZCTTGGGGGG ACGATAGZZT TCTTTTTCTZ AGGCTACGCA GGGTAGAZGG 21420 
AGZTAAAGAZ- TZTGGTGACG AZTTGGAGGG AGGCTCGGG7 GGAGGAGTCG TGGGTGA37G 
TGGAGGTGTA GTCTGCTGCG AGGGTGGCGG ACGCATAGG7 GTTGAAGAGT CTGGZCTTZZ 

TG7AGGAZ77 GAAAGCGGTG GCCTTTGAGA AGACTCTGGA GACTGCGTGG GTGGZAA7GC 216 00 

AGGAGA73Z-A GAA7GAG7AT CCG7GGTCZC CGGAGACACA GGATGGGATG G AG G G ATT G G 21560 

GGAGGAA3A3 G7GG77ACGG GGGG7AAGAG TGCZGGTGGA GGTAAAGGTG T7GCGGGAG7 21720 
GGGTGAA33A A7GGGAGCCA CCGGTAAAGT AGGAC7AGAZ A ZAAATG CTG GCAGZZZGGA 
TGTGAACACT G7GGGAC7TC CAGG7ATAGG CAAGG7G7GG GGTCZACATT ZCCG3ZZG7Z 
C-ATGGAG7Z3 GZGACATGC7 TCCTTCGCGG 77G7AGA7G7 AGGTZATZGC ZAAGGTZAZA 

7ZTTTZC33A GAZZTGTTTZ G7TTZZ7AZA A7TTCZTCTZ GTTAAGGGZG CGCZGG7377 Z196C 

■ZZGTCCCGAZ Z7ZAGGCGZA TTCZZGGGGG CGCCATCCTC GGGAAATCTG GTCTGACAAC 22 320 






CAAAGTAAAA TTATGGAGGZ GGTGGZAGTA TATTCACA77 ATG C AATA Z Z CGTAGTGACZ 

AZAAGGGGGA GCTZTCAGAZ AA77AAG CGG 77ACAZACAG TAGZAGGC7G CAG7ACCGZC 

ZATGGZCAZA G G ATGTAG A. T ZGZAGAZAZT GAAACGZ73A AAZAZAGZAT TAAGZTGZAA 2 220: 

TA2ZGCCGA7 GG C Z AC C AG A TGGZACGZGZ CG 2 2 AGO AAA 777AAGTCZ7 GG7GGZ77AZ 222SG 

Z7GZZAG37A AACAAGGTTA AAG7GGGTT7 GZTGGCZ77G Z377GZZA7G GA7GZ7A7Z7 

A3GZAAG7ZZ AG AT AT AT AA. TZCGGGZG7G AG AAA 2 AG AA A CGG 2 2 AATA A2ZZA7G777 

77ZGAAAAZ2 ACZACACAZC TTAACAZAAA TZATGTAZAZ ZTGGTATTAZ 7ATTTZZCA2 2244 C 

AZATZTTATA GZATTTCAAA GATAAGGGTG CZTTACGGGZ CG Z Z ZG AAA Z AAGTGGGZGG 2 2 500 

GZGZTAZTZA ZTGTTTA7AA GTZAGCCGGA CZAAGCTGZ7 GZTCTTGGGG AZG7GAZ7GZ 2 25 50 

T7ZGT2-Z-Z3Z AGZTGZCTZZ AAATGATAZA CAZATTTT77 GA77GTZZZZ- GG 23 2 23 2 27 Z2£20 
AG73GAGG3G GGAG7TATA7 CAAGCTACT7 7CTGAT7GG7. G222CAGGCA G3A27G22A7 22- 
AAAAA3TGAA GAAGG2GTG7 CTGCTTTGCA GAA77TACCC CC=A=TG7GC 7273GG7T33 
TGGCACCGGT TCAGTG3T CC GACCTGTCGT C7GTGCTCCC C237GGA23A C 3 3C3 A3T3 Z 

^^—j. 7™A-TA' TGGGTGG237 72A727GGG7 

C777C3GGGG 727ATGTGTA o_7-TTC~- 

AGCCC7C277 GGCGCGG77G GGGGTGCCCG CG7TCAGGGG CGCATG2GGG G3777G37G2 

C2TCACCTGC G=CATCACGC CCCGTGCTGA CATAGT7AGC G7TAG27GG2 AAAAAAGG2A 229EC 

G77G22CGG7 CC7G7AAACG TCGCCACGTA CAG22ATTCA TATGGGG7GG 7GG77CAGA2 2 3 04C 

22AG7ACCG2 2ACAAGGCAA A7A7AACCTG TCCTGGGC7T TGGAACT27A G7C77G7TA7 231 CC 

C3ATAAC7T7 GCAGTGGA7G A7GAGGGCTG 77ACC7GTGT A7C77TAA27 CA777GG7GG 2316C 

CCGGCAGGTG 73ATGGACAG CGTGCCTGGA AGTGACA7G7 CCCCCTACTG GACACGTGCA 2 32 = 0 
GG7AAATAGC AGAGAAGACG CAGACACCG7 CACC7G77TG GCAAC7GGTC GCC2ACCCC2 
2AATGTCA22 7GGGCCGCAC CCTGGAACAA CGCC7C77CT AC2GAGGA3C AG77CAC7GA 

CAGTGATG37 777ACA377G CGTGGAGGAC 2GTGAGGCTG 2CG2G7GGGG A7AA7A32A3 2 34CC 

CCCAAG7GAG GGAA7A7G7C 7CA7GAG273 GGGAAATGAG AGCATATGAA 722CGGC772 2346C 
7A77CAAGG Z C2C77GGCC2 A7GAC3T7GC CGCGGCCCAG GGAA27377G 233GGG77G3 
( ^„ CT=TG 3— GGG2C7A7 TTGGGATATT GGCA77ACA7 CA77GCCG2C GCAAG2AGGG 
CG373CAT2A C 37 AG7TCAG A7GACA7GGA GCG727A723 A2G2A373AC 7AGA7GGACA 

CCCCG7GAAC 73733TG377 AGC2A722G7 77C7GA7777 GACAGA2AA2 A27AC7A7G7 2 3 7 0 C 

CCCAAAGAC7 3777TTTA2A G7GCGA7GGC C7773AGGG7 72277GAG7G 737A3373G7 2376C 

CCCG7G372A 77G7G7GG77 7GGGA37GA2 77C722A777 7337372323 7777333777 23E2C 

TGCC2T3C23 22A322AACG 7GGA7CA7A7 7377723237 CAG^GAv*^ - _36c. 

GG A 2 A G AAA 3 37CACC733C Z 2 AAA C G G AG GA7737A337 33G73732A" 77A77AGAG3 2 3 94C 

TTGG737377 3AAGGAC3GA 73AGG2GGGG A3GAGGGGG7 G3G3GAGA77 7AC732A33A 240CG 

2TAGG77A33 7TGAAAG 3 C 3 GGG7AAAAGG 23733 27AAA 2AA2A337A7 A77A277377 2 4 06C 

A7TG7AGG27 ATGGCGGC2G AGGA777C37 AAC2A72773 77A3A73A73 A73AA72273 2412C 

3AATGAAA77 CTAAA7A7GA G3GGA7A73A C7A3777G3A AA2772A373 7AGAA373A3 2416C 
23737G7GA3 A7GAC2A2C3 7G3732377A 3A33733AA2 G77G3AA7A2 7277:-. 

77722TCA7A AATG77377G GAAA7GGA77 G372A227A2 A77777732A A32A333A72 24 3 0: 
G2G3GCAGGA G3GA7AGA7A 7A77G37327 GGG7A7273C 27AAA27232 7G737777AG 
2A7ATC737A T73GCAGAAG 7G7TGA7377 7773777222 AA7A72A777 22A2A332T7 

2A27A777 A7A73737A3 7733A7A737 72A3737737 244 6-i 




G73CA3A27T 3AAA777777 

G73C372AG7 37AGTGAGGT AC77337G37 33CA7A7727 A232377237 G3272AA. 

G2AG72C772 3GA7GGG7A2 7GA2A7G7G3 732A37377A A7732A7733 7327373GG3 246C, 

GGA7GCC737 CGACACAGGA 3CA33G7GG7 CGA232G372 A32AA32A33 22A737377A 2465: 
7GAGAACGCG GGAAACA7GA C7GCAGACTG GCGAC7GCA7 GTCAGAACCG 7G7CAG77A2 
7GCAGG777C CT3TTACCCC 7GGCCCTCC7 T ATT CT G 77T 7A7GC7CTCA OT3GTG7G7 
GG7GAGGAGG ACAAAGC7GC AAGCCAGGCG GAAGGTAAGG GGGG7GA77G TTGrTGTGGT 
GC7GC7G777 777G7G7TTT GG7TCCC77A CCACG7AC7A AA7C7AC7GG ACAC7C7GC7 24 9. 
AAGGCGACGC TGGATCCGGG ACAGC7GC7A 7ACGCGGGGG 7TGA7AAACG 7GGG7C7GGC 24 95 
AG7AACC7CG 7TAC7GCAGG CAC7GTACAG CGCCG7GG77 CCCC7GA7A7 AC7CC7GCC7 
GGGA7CCC7C 777 AGG CAG A GGATG7ACGG 7CTC77CCAA AGCC7CAGGC AG7C777CA7 
G7CCGGCGCC ACCACG7AGC CCGCGGATGT GTACG7GCGC 77CCCCC77A A777AA7C7A 
GCC7CCCGTT CCCA7GA7GC AGAGAGGCGA A7T7GG777G 7ACACAGA7G 7GAC7A7G7A 252C 
777G7T77A7 7A7GCGA77A AA7GAGGGG7 C7GA7CCCAA AAGCAA7G77 7AG7GG7GG7 2 526 
CG77GATG77 CTTGACGCTC CA7AGGTAGA 77GAC7GGAA CGCCATGGCC CACGG 3GACA 
7GGACAGGGG 7G77AGG7C7 GG7GGAACA7 GC7GCCAC7G CCACGGA7GG AACA7CAGA3 
A7GGG7C7AT GA7CAGGGCA GCG7G7CGCC CG7CAC7GGA 7GTAAG7CCG GCCACCG7GG 
AG77GCC7G7 GGGG777C7G GGA7AGTG7C 7GGC7GGCAG GG7C7CA7CC G CG G CA777 Z 
CA7G37AGG7 GAGGG7TA7C 7CGCC7CGC7 G7C7CAG7A7 G7AG7CGAGG GC 
CG7ACCGGAC CCCCAGG7AC 777CCC7GGG CCCAGC7GGG CAG CACCG7C CCCCGC^ 
C7CGGAGGAA AACGC7C77A 373T7C7GAG GGA7C7G7AT G 777 AG C GAG 7GGG7G7CA7 2566 
AC AG C77GG A CACG77GGTC 7CCAGG777A CCGCCCAGC3 C7GGGG7GG7 G7GGG7CCG7 2=74 
ACG7G7A7GG 7GAGGA77CC GACCGGCCCA C7ACACCCA3 33CCACCA3C A3 C7GGAAG C 2 580 
CCACC7CGCC ACAGCAGA73 GAGAA7G7G7 CGGG7C7377 7AGAAAC7C7 G7CAGGG7G3 2586: 
AGGCACA337 AGGG7CG77A CACAGCGCCA GGACCCA7CC CC7GGCGCTG GC37A3C7G3 2 5 32! 
CC7GGCAGCC 7377C7GAGA CA7G7AA7CA GACCAGAGAA CCCCGACAAG GAC737CC7C 2 5 5e; 
G _ TAA _ r - c TT==A= AG7C ACCG7GGCCA CC7CAAAGCC CG7G77C73C AACGCGGCCA 2604. 
7GAGC3CG7A CGGGGCAC7G C7ZCCAGGCA GCACCAACGZ 3GCCACACGG CGCGGGGAGG 2610: 
7GGGGCACGA AAACAGGCGC AGC7GAC7CC CAAGG CAC AT GG Z C C7TAGG C7GCCCAGG7 26 16; 
GATGC7CCAG ACGACCCAGG TCC77CC737 GCA7G7CC7C CAG7GGG73C AG3GGAGGCG 2S22< 
7CACCA3G77 CCACA77TC3 7 CAGAAAAGG AG37CCA7GA GAC77G CAAG GAAGTCAGGG 2626C 
7C7C773AAA CACAACTG7C 7CG77C7GCA AAACCG73AC G77G7TGCC7 737CCC7CGG 2634 C 
G3CCAAC337 GCCCAGTGGG TG7GCCACGC AGCG37A37C CC7GGCC3CC 
C7GACAAG7G 7ACCTG3GGC ACC7CAACCA GTGCCCCAGG GG7C7C7GAA 
C3AGCGGG77 AG337GGGCG GG7AG7GA3A GCTGCAGTCC CC7GCAGCCG 3CCA333CCA 26 52 0 
7C7CGA77GC AGA7GGGAGA AGCCCTCC37 CCCC7A7G7C G7GCCCAGA7 ACAA7GAGC2 
TC77GGACA7 CAGG7AC77A A CAAG CA7G A ACA3GCTGGC 3ACC37GGAC G 3 377 CAG A 3 
G3G37A77GG GTGCC7GGA7 3CCA3GAAGT 7GT3C7CGAA G3TGGACCC3 GC7A73AGAC 
AGC7C7GAT7 CACGGCCAGG 7ATA7CAGGG CG77GC7772 GA==T7TA=G 7CCGGG37GA Iff: 

CCC7G7ATC7 GGA7CCG77G ACC7CGGCCC AGCTGGTAAA CACCACCGAG 77GAAGG3AA 2 S S - - 

GGACC7CCAC CGTTTCTTGC TG77GT37GA 7GCGGACA7G GCG77CCGAA AGG372GGAG 268 = : 

AGC7GGCAGC CGAGGAGATG GACAG7GCCA CTCCCAGC7C C7GGGAGAA7 7CG77GCAGr 2-4Z 
CGAAGAGGCA C7CC7G7AGG AGGCCGGC77 GG7GGTCCTG TGGAGTGGAG GCCACGG3G 2 
C AG TT AG CA C TACGT==TGG AGC7TGGACA CGGGACTGAA CATGAGG77G GTGAGAG2G7 

CGS7GA7GGC ATAGG7GGCC CCGGTGGATA CATTAGTAG Z CATC77G7AG GCC7GG72C2 Z-'.ZZ 

-■"ATGGCCAT 7GCC7GACGC CT3CACGCTG GCACTGGAAG CAGCTCC7GG GGCAGGGC" Z'ZSZ 

TCACCCAGGT C7CGAAGTCG TTGTGTAGGA GGTTGGCCA7 GGACGGAGTG ATGGGGTCCA 2~24 7 

CZGTGTCGGG CAC7C7GGGC GCGACCCTC7 CGGCCAGCAT GGACGAGTG2 AGCACGAGG7 Z~3Z)Z 

GGTAG7CTGA AACCGGTATG 7CCAGGGG7C CCACGCCAGC CTG77GGGCG ATGAGGCGG7 "36: 

TGGAGCATCG G7CCA7GTGT CGCGTAAAGA AC7CCTTGCT GCCAACCGTC GAGTGGCGAA 2^420 

G7AAC7GG7G GA77G7GGAG CCGGTGGCAA AAAGGCCCCA G7CAACA722 7CGGGG7G72 I74B0 

GCGAGACGCG GACA23A7CG GACAGCGCCA GCCAGGGGGA CGGGGGGG7G GACGAGG377 =7540 

GG7C7ACAGA GAAGACCC7C G7GG7C7CCC CGGTGAGGT2 G7CTAC7A77 C7GA7GGC7G 2 7 c' 00 

GGTGCTCCGA GG7C272GCG AG GACCG77A CCTGGCACGC GCACAGGCGC GCGG7GGGC7 27660 

GCAG7ACC7C CAACGGGG7C 7CGCCCAGA7 CCC2AGGCAC CGCGGCCGAG 777G7CAGCA 2~20 

CCGCAAACAC CAGGGAGCAA TACACG77GA GAAAGTGCTC 7GGCACCG22 GGG77GACGG 2-7BC 

CA7CCGGACC GG2CG2GGGA 7GCGCAGG2A GG7GGG7GC2 2Ar272G7CC- GG7AG777GG 2-940 

AGAGAAACAG 27CGAGGCCG G7CCGCGG2G CCAGCGCZ7G CAGG7GC27C AGCA27GGGG 27900 

CCGGGTCATG CGA727G777 AGTCZGGAGA AGA7AGGGCC C77GGCAAGC ZG 27GGACCA 2-560 

G C77CAGG37 C7CCAAGA7G CGCACCGCA7 TG7CGGAG77 G7CG 2 GA7A2 A G G 77 A G G G 7 2S02C 

AGGTGTCZGG 7CGA7ZZGTG GG77CAAACC 7GCZZAGAZA CAZCAC7G7Z 7G 77GGG 3GA 28 0B0 

7GA7CC77C7 CAGGGA3A7G CA772777GG AAG7AG7GG7 AGAGA7GGAG CAGA77GCCA 2B140 

GGGCGTTGCG AGGAG7GG7G GGGA7GG7GC GCAC2G7777 7AAGAAACC2 CCZAGGG7GG 2S20C 

GGACTCCCGC 7CCC7G2AGC A7C72GG727 GCTG7ACGCC 777GGGGAA7 A7G73AG3GA 2E2SC 

A7GGGG7G7G CGCACGGGG7 CCGAGGGCZG G7TCGG7GG7 A7A2AGGZCG Z-7GAGGGCZZ 22220 

C2TG7G7CTG 7CCGZZ7GGA AA2AGGG7GZ 7G7GAAAZAG ZAGG77G2ZA AGGCC3CGAA 2c 3S: 

TACZC7TC7G ZACG27GC7G 7GGACG7GGG 7GTA2GC7C2 G7GGA7C22G AAZGZ77G72 2B44! 

7GGCACAG77 CGAGGGCCAC CG77ZCA7GG 7GCA7C7722 2GG7A7GA2A A^o 

CCA2G77ATA A7TG7CCCGG G7TGAAGCZ7 GCACCG22AG 2GG7AGCAG2 777G722ZZA 2S5SC 

GGGA7ATCA7 AA2AGZ2TGZ A7AA7GACAT CA7777CAA7 G7G7G3C27A 327A2GGG27 2BSZZ 

GGGGACCC7C GGG2A2772C AACGCCTC37 ACGG7AC3AG G72GG7A777 737G7AAA7G 2 55 B O 
C Z G7G AT AAA. 77GAGG7GGG 7GTGG77G7A GCAGGG7G72- 7G73A7777G GAGA--^--. 
GCCTGCCCAC TTCGACTCTA GCCCACTCCT GCAATCCTAG C7C77GCAG2 
G CT G7G77GA CAATGTTGTG GGCCGG7GG7 G CA7G77TGG CCCG7AGCCA AAGGA7AGAA 
CACG=TCG=T crCCCGTGGC ACAGACCGCC TGATGACATG GGGA7A7C2A AG GAG CG G TG 
ACAGCACAGC GAGCACCGTC TGTATTTCCA CATCCCGTCT C777CG777C 7CC77CGAAG 
TGGGAGG777 7CGGAAAG77 A7CCA7AGCA GA7AG7AG C Z 7CCGG7GC3A C2GGG7ACGA 
GAGTGAG7G7 GCCCG7ACGG C7TG7ATAAA AG7TCACAAA AG 277CC7CA 7CCGCGG7GA 
GA-CAC7C7C CAACCACAGC CCAGTGACG7 CG7AGGCCA7 GCC7AGAGGG CGZ?.ZCGC~Z 
CCGGGGA2AC CC7GTG7AG7 CAGGC7GCCG AGAAACCCG2 GAGA7C777G GGGAG7AGGA 
AG AAA 27 7 AG AA7CCCCAAA 7A7G7CGCAG 7CACAGGT7G 7CGGGCAGAG 7C7G777CCG 
C777CA7GGG A7CGACAG77 AC77G7AGCC A7G7CACTAA CC7CAAA7AC 
C7A7CGA7GG AAAAATGCTG 7GG7CCTAGG 7TAG7CCG7G GGAAACAAAA 
CA77T2A7CT GCAGGC7GAA A7GG7GGCGG A7CCAGAC7C C77AGACCAC AG77GCT 
A77AGAGA7A CC7GA77GG7 7AA7ACAAGC GGACGCACGC G77GG7GGAG GCG7G77G7C 
GCCCAAGA7A CTAGCATAGG 7GAC7G7GCG 77CGCTA7GT AG77GC7GCA 7772AAG772- 
GG7CG77AC7 7CTG7GT7GC AAACCC7TAC 7GGAGA7AA7 GCCA7G7C7G 77G7GGAAC7 
i=GC GAG7G7A7AA CATTTCTAGA 7GG7AGAGG7 GG7AAACGGC GAG2TAAA73 
;7GG GGACATA7C7 7GCC7GCA7G AGCA7G7GG7 G7G7CG7G7G G7G7A7A7A7 
TGGT _ T:rTT gttgttacat T GT7GAACGA CACAAG7C7G C7C777CGGT AGAG A7AAC 2 2 9E20 
CACCAG7A2G G777GGCGAG 7AC27AA7AA GAAAAAA7AA AA7CG77AA7 C7C7G77777 2 9880 

C AAA7 C AG 7 2 C7GAAG7AAC AC77G7AGTC CAAC2G7CAG 7GTAGAG2AG GA77AAC7TA 

A2A2AG2A72 CAGGACA7G7 CCA7GC7AAG GAAA7AAAC2 AAAG77A7G7 T7CGG777G2 

777A7GA22A GGGAG27G27 ACCCAGG7AC AAAAAA7C C7 7ACCCAAAAA 7AGAAACAGG 

AAGC2A22AG AGAG7GAAGC 777G7GAAAG C777GC2AGC AGAAGAAACA A7A7AA7AAA 3 01 8 0 

AAGCCACAG2 C7GCTAG7AA 7G77A7ACTC C27G7AAA7A AAAAA7A7GG ACAG7AA7AA 

^^-^^ CCAA7AAG7A 7G7GGAAAAA A7G7AA7G7A AACCAC7A7A C7GG7AAAAA 

CA7AC277CG 77A77GG7G7 277G77CGCG C777 AT AAA C AG7A77CC7A 77G77GTGG7 

7AG7G7AA22 AACAGTCC7C C77GTAAAAG 7AAAAATGAC A7AAGCCC2T 7AG77GAT22 

AA7C2AA7G7 CG77TCA773 T7A7AAACAA GCCGG72ATA CC7G7AA7AA 

7ACAAAA7G7 TATAA7AG7A 77GGTAA7GT 77AGT7AAGA 7AA7G7AAAC 

T2A7A7AC2A A7ATGTATGC AG777ATGCA TC27G2GA7G ATT AC A 3 AAA GG2ATGAA7G 

G3? ^ z: . z: ^ AAAAAG G22G G7GTTGCC77 G AGTATAC 27 G7AG7AAAAA A7AAATAA7A 

TT3TTGGT7G CAATGCTTAG GTGCAAGCAG A2A7AA77GC A 7 AG 2 AG7AA AAA -2 AG ACT 

ACACA7GCAG CGAGC7TGAG ACAAGGCCCA T7A727G77G 3 07 80 
CAAAGATATG 7ATAAAAAAA ACAASGAAGA ATG7C~ 
G7G7CCAGTT GT7GTAAA7C TGCAATCCCA TTGA3AA7A7 AAG7ACr~ 
TGCACAGTAA TCCSCTATCA A7AGTGGATT 7AACGAC7C7 7AA7G77" 
3AA7GGG73A AAAA3ACATA CAG3GGAA7T A Z GTTTT777 AAAAAAT7G 
7ACA7AA777 7TA77TAATA AAAAACCTT7 AGTAAAACTT ACGAG7AA77 

ATAAT ACAAACACAA ACAG7ACTCA AAGTAC7T7G AG 7 

CAAA3GCAA7 A CAT GCTAAA ACAAAAGACA AATACACGAG ACA77TAAAC 
TAGAAAGAAA T AAG 77 AAA C A777AAAAAA 73TAA.77AC CAACAA77A7 
ATG3GAGG33 AA3G7TGAAA ACGTTSTTTT 777GACT3CA CA7A7A7377 
;.AAAAAG77G G7A37AAACA C77ATGT7AC TGAGCAAAAA 7ATGGTG77T 
TAGTTAAAA3 ACAAAACATA A7AGA3AAAC ACCCACAACA TG77A7AA37 
AAG7AGGCGA CA3G7A77TT 77GTAAT7CA 7TGTAGACAA AAAGCCCAAG 
GAAGTG 3 AC A AAAGAAATA7 GTAA7TAAG7 G7AG77G3AC AAG3AA77AT 

_._ -.--A'-AGAAC CA3ACA7CGT A77777G777 3GAAACC7AA 

_ Q „„_. T AAAATG3CAC A3C73GAAAA AGC73A7AA7 37AACA77G3 
AGG3GAAAAA 7G7AATAAA7 T7TACAGACA G7777GCC7A 
AGGCAC7AAG GG77T7TT7G CGAAAGGAAA AATGCGCCCG 
GGAAAGG33G GA7GG3G73A 73333GAA7G G7G3GAAAGG GG33A7GGG3 
ATG3TG3G AA A3GGG73A7G GGG7GA7GGG GGAA7GGGGC- GAAA3GGG3A 
A3GGGGAA7 3 3GG33AAAGG 333AA73GGG GGAAAGGG3C- GA7G33G33A 
GGG33GAAAG G3GGAA7GGC- G3GAAA333G 33A7G33GG3 AAAG3333AA 
G33G33A7G3 G3G3AAACSS G3GA7GG33G GAAAGGGGSS A73333333A 
GG3GGG3AAA GGG3GGAT3G G3GG3AAAGG GGGGA7GGGG GGGAAAG3GG 
G3GGGGG3GC- AGGG3GAA33 GG3TGAAG33 GGAAGG333G AGGCGAA 
What is claimed is:

1. An isolated nucleic acid encoding a Kaposi's
sarcoma-associated herpesvirus (KSHV) polypeptide.



2. The isolated nucleic acid of claim 1 wherein the
polypeptide is selected from the group consisting
of:

a) Thymidylate synthase (TS);
b) Viral protein kinase;
c) Alkaline exonuclease (AE);
d) Helicase-primase, subunit 3; and
e) Uracil DNA glycosylase (UDG).



3. The isolated genomic DNA molecule of claim 1.

4. The isolated cDNA molecule of claim 1.

5. The isolated RNA molecule of claim 1.

6. The isolated nucleic acid molecule of claim 1
which is labelled with a detectable marker.



7. The isolated nucleic acid molecule of claim 6,
wherein the marker is a radioactive, a
colorimetric, a luminescent, or a fluorescent
label.



8. A replicable vector containing the isolated
nucleic acid molecule



9. A host cell containing the vector of claim 8.

10. The cell of claim 9 which is a eukaryotic cell.

11. The cell of claim 9 which is a bacterial cell.
12. A plasmid, cosmid, λ phage or YAC containing the
isolated nucleic acid molecule of claim _ .

13. A nucleic acid molecule of at least 14
nucleotides capable of specifically hybridizing
with the isolated nucleic acid molecule of claim



14. An isolated thymidylate synthase polypeptide the
sequence of which is set forth in SEQ ID NO: - .

15. The isolated polypeptide of claim 14, wherein the
polypeptide is linked to a second polypeptide to
form a fusion protein.

16. The fusion protein of claim 15, wherein the
second polypeptide is beta-galactosidase.

17. An antibody which specifically binds to the
polypeptide of claim 14.

18. The antibody of claim 17, wherein the antibody is
a polyclonal antibody.



19. The antibody of claim 17, wherein the antibody is
a monoclonal antibody.

20. A host cell which expresses the polypeptide of
claim 14.

21. A vaccine which comprises an effective immunizing
amount of the polypeptide of claim 14 and a
suitable pharmaceutical carrier.

22. An antisense molecule capable of hybridizing to
the isolated nucleic acid molecule of claim 1.
23. The antisense molecule of claim 22, wherein the
molecule is a nucleic acid derivative.

24. A triplex oligonucleotide capable of hybridizing
with a double-stranded isolated nucleic acid
molecule of claim 1.

25. A transgenic nonhuman mammal which comprises the
isolated nucleic acid molecule of claim 1
introduced into the mammal at an embryonic stage.

26. A method of diagnosing Kaposi's sarcoma
comprising: (a) obtaining a nucleic acid molecule
from a tumor lesion or a suitable bodily fluid of
a subject; (b) contacting the nucleic acid
molecule with the labelled nucleic acid molecule
of claim 6 under hybridizing conditions; and (c)
determining the presence of the nucleic acid
molecule hybridized, the presence of which is
indicative of Kaposi's sarcoma in the subject,
thereby diagnosing Kaposi's sarcoma.

27. The method of claim 26 wherein the nucleic acid
molecule from the tumor lesion is amplified
before step (b).

28. A method of diagnosing a DNA virus associated
with Kaposi's sarcoma comprising: (a) obtaining
a suitable bodily fluid sample from a subject;
(b) contacting the suitable bodily fluid of the
subject to a support having already bound thereto
a Kaposi's sarcoma antibody of claim 17, so as to
bind Kaposi's sarcoma antibody to a specific
Kaposi's sarcoma antigen; (c) removing unbound
bodily fluid from the support; and (d)
determining the level of Kaposi's sarcoma
antibody bound by the Kaposi's sarcoma antigen,
thereby diagnosing Kaposi's sarcoma.

29. A method of diagnosing a DNA virus associated
with Kaposi's sarcoma comprising: (a) obtaining
a suitable bodily fluid sample from a subject;
(b) contacting the suitable bodily fluid of the
subject to a support having already bound thereto
a Kaposi's sarcoma antigen encoded by the
isolated nucleic acid molecule of claim 1, so as
to bind Kaposi's sarcoma antigen to a specific
Kaposi's sarcoma antibody; (c) removing unbound
bodily fluid from the support; and (d)
determining the level of the Kaposi's sarcoma
antigen bound by the Kaposi's sarcoma antibody,
thereby diagnosing Kaposi's sarcoma.

30. A method of treating a subject with Kaposi's
sarcoma comprising administering to the subject
an effective amount of an antisense molecule of
claim 21 under conditions such that the antisense
molecule selectively enters a tumor cell of the
subject, so as to treat the subject.

31. A method of treating a subject with Kaposi's
sarcoma comprising administering to the subject
having a human herpesvirus-associated KS a
pharmaceutically effective amount of an antiviral
agent in a pharmaceutically acceptable carrier,
wherein the agent specifically binds to the
polypeptide of claim 14.

32. A method of prophylaxis or treatment for Kaposi's
sarcoma (KS) comprising administering to a
subject at risk for KS, the antibody of claim 17
in a pharmaceutically acceptable carrier.
33. A method of vaccinating a subject against
Kaposi's sarcoma comprising administering to the
subject an effective amount of the polypeptide
of claim 17, and a suitable acceptable carrier,
thereby vaccinating the subject.

34. A method of immunizing a subject against a
disease caused by the herpesvirus associated with
Kaposi's sarcoma which comprises administering to
the subject an effective immunizing dose of the
vaccine of claim 21.

35. A method of identifying a compound that inhibits
KSHV replication in a subject which comprises:
(a) expressing a KSHV enzyme in a bacterial
auxotroph, which auxotroph is dependent on
a product of the expressed KSHV enzyme for
bacterial growth;
(b) administering the compound to the auxotroph;
(c) measuring bacterial growth; and
(d) comparing bacterial growth in step (c) with
that of the auxotroph in the absence of the
compound so as to identify a compound that
inhibits KSHV replication in the subject.

36. The method of claim 35, wherein the KSHV enzyme
comprises one from the list as set forth in claim
1.

37. The method of claim 35, wherein the KSHV enzyme
comprises thymidylate synthase.
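
The comparison recited in steps (c) and (d) of claim 35 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the specification: growth is taken to be an optical density reading per replicate culture, the function names `percent_inhibition` and `is_candidate_inhibitor` are hypothetical, and the 50% cut-off is arbitrary.

```python
from statistics import mean


def percent_inhibition(treated_od, control_od):
    """Percent reduction in mean growth of compound-treated auxotroph
    cultures relative to untreated cultures of the same auxotroph.

    Both arguments are lists of growth readings (assumed here to be
    optical densities); this choice of measurement is an assumption.
    """
    control = mean(control_od)
    if control <= 0:
        # Without growth in the untreated cultures there is nothing to compare.
        raise ValueError("untreated cultures show no growth")
    return 100.0 * (1.0 - mean(treated_od) / control)


def is_candidate_inhibitor(treated_od, control_od, threshold=50.0):
    """Flag a compound when growth drops by at least `threshold` percent.

    The threshold is a hypothetical screening cut-off, not a claimed value.
    """
    return percent_inhibition(treated_od, control_od) >= threshold


# Hypothetical readings: three untreated and three treated replicates.
control = [0.80, 0.82, 0.78]
treated = [0.20, 0.22, 0.18]
print(round(percent_inhibition(treated, control), 1))
print(is_candidate_inhibitor(treated, control))
```

A reduction in growth measured this way does not by itself show that the compound acts on the expressed KSHV enzyme; the sketch covers only the arithmetic of the comparison.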



FIG. 2B 

Nde II 